Machine learning for algorithmic trading: predictive models to extract signals from market and alternative data for systematic trading strategies with Python
Main author: | Jansen, Stefan |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham ; Mumbai : Packt, July 2020 |
Edition: | Second edition |
Subjects: | |
Online access: | Table of contents |
Description: | xii, 790 pages : illustrations, diagrams |
ISBN: | 9781839217715 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV046950335 | ||
003 | DE-604 | ||
005 | 20230124 | ||
007 | t | ||
008 | 201020s2020 a||| |||| 00||| eng d | ||
020 | |a 9781839217715 |9 978-1-83921-771-5 | ||
035 | |a (OCoLC)1220895167 | ||
035 | |a (DE-599)HBZHT020578539 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-573 |a DE-1043 |a DE-739 |a DE-19 | ||
084 | |a QH 500 |0 (DE-625)141607: |2 rvk | ||
084 | |a QK 620 |0 (DE-625)141668: |2 rvk | ||
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
100 | 1 | |a Jansen, Stefan |e Verfasser |0 (DE-588)143872338 |4 aut | |
245 | 1 | 0 | |a Machine learning for algorithmic trading |b predictive models to extract signals from market and alternative data for systematic trading strategies with Python |c Stefan Jansen |
250 | |a Second edition | ||
264 | 1 | |a Birmingham ; Mumbai |b Packt |c July 2020 | |
264 | 4 | |c © 2020 | |
300 | |a xii, 790 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Wertpapierhandelssystem |0 (DE-588)4510686-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Algorithmus |0 (DE-588)4001183-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Wertpapierhandelssystem |0 (DE-588)4510686-1 |D s |
689 | 0 | 1 | |a Algorithmus |0 (DE-588)4001183-5 |D s |
689 | 0 | 2 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | 3 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032358867&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-032358867 |
Record in the search index
_version_ | 1804181861575950336 |
---|---|
adam_text | Table of Contents
Preface
Chapter 1: Machine Learning for Trading - From Idea to Execution
The rise of ML in the investment industry; From electronic to high-frequency trading; Factor investing and smart beta funds; Algorithmic pioneers outperform humans; ML and alternative data; Crowdsourcing trading algorithms; Designing and executing an ML-driven strategy; Sourcing and managing data; From alpha factor research to portfolio management; Strategy backtesting; ML for trading - strategies and use cases; The evolution of algorithmic strategies; Use cases of ML for trading; Summary
Chapter 2: Market and Fundamental Data - Sources and Techniques
Market data reflects its environment; Market microstructure - the nuts and bolts; How to trade - different types of orders; Where to trade - from exchanges to dark pools; Working with high-frequency data; How to work with Nasdaq order book data; Communicating trades with the FIX protocol; The Nasdaq TotalView-ITCH data feed; From ticks to bars - how to regularize market data; AlgoSeek minute bars - equity quote and trade data; API access to market data; Remote data access using pandas; yfinance - scraping data from Yahoo! Finance; Quantopian; Zipline; Quandl; Other market data providers; How to work with fundamental data; Financial statement data; Other fundamental data sources; Efficient data storage with pandas; Summary
Chapter 3: Alternative Data for Finance - Categories and Use Cases
The alternative data revolution; Sources of alternative data; Individuals; Business processes; Sensors; Criteria for evaluating alternative data; Quality of the signal content; Quality of the data; Technical aspects; The market for alternative data; Data providers and use cases; Working with alternative data; Scraping OpenTable data; Scraping and parsing earnings call transcripts; Summary
Chapter 4: Financial Feature Engineering - How to Research Alpha Factors
Alpha factors in practice - from data to signals; Building on decades of factor research; Momentum and sentiment - the trend is your friend; Value factors - hunting fundamental bargains; Volatility and size anomalies; Quality factors for quantitative investing; Engineering alpha factors that predict returns; How to engineer factors using pandas and NumPy; How to use TA-Lib to create technical alpha factors; Denoising alpha factors with the Kalman filter; How to preprocess your noisy signals using wavelets; From signals to trades - Zipline for backtests; How to backtest a single-factor strategy; Combining factors from diverse data sources; Separating signal from noise with Alphalens; Creating forward returns and factor quantiles; Predictive performance by factor quantiles; The information coefficient; Factor turnover; Alpha factor resources; Alternative algorithmic trading libraries; Summary
Chapter 5: Portfolio Optimization and Performance Evaluation
How to measure portfolio performance; Capturing risk-return trade-offs in a single number; The fundamental law of active management; How to manage portfolio risk and return; The evolution of modern portfolio management; Mean-variance optimization; Alternatives to mean-variance optimization; Risk parity; Risk factor investment; Hierarchical risk parity; Trading and managing portfolios with Zipline; Scheduling signal generation and trade execution; Implementing mean-variance portfolio optimization; Measuring backtest performance with pyfolio; Creating the returns and benchmark inputs; Walk-forward testing - out-of-sample returns; Summary
Chapter 6: The Machine Learning Process
How machine learning from data works; The challenge - matching the algorithm to the task; Supervised learning - teaching by example; Unsupervised learning - uncovering useful patterns; Reinforcement learning - learning by trial and error; The machine learning workflow; Basic walkthrough - k-nearest neighbors; Framing the problem - from goals to metrics; Collecting and preparing the data; Exploring, extracting, and engineering features; Selecting an ML algorithm; Design and tune the model; How to select a model using cross-validation; How to implement cross-validation in Python; Challenges with cross-validation in finance; Parameter tuning with scikit-learn and Yellowbrick; Summary
Chapter 7: Linear Models - From Risk Factors to Return Forecasts
From inference to prediction; The baseline model - multiple linear regression; How to formulate the model; How to train the model; The Gauss-Markov theorem; How to conduct statistical inference; How to diagnose and remedy problems; How to run linear regression in practice; OLS with statsmodels; Stochastic gradient descent with sklearn; How to build a linear factor model; From the CAPM to the Fama-French factor models; Obtaining the risk factors; Fama-Macbeth regression; Regularizing linear regression using shrinkage; How to hedge against overfitting; How ridge regression works; How lasso regression works; How to predict returns with linear regression; Preparing model features and forward returns; Linear OLS regression using statsmodels; Linear regression using scikit-learn; Ridge regression using scikit-learn; Lasso regression using sklearn; Comparing the quality of the predictive signals; Linear classification; The logistic regression model; How to conduct inference with statsmodels; Predicting price movements with logistic regression; Summary
Chapter 8: The ML4T Workflow - From Model to Strategy Backtesting
How to backtest an ML-driven strategy; Backtesting pitfalls and how to avoid them; Getting the data right; Getting the simulation right; Getting the statistics right; How a backtesting engine works; Vectorized versus event-driven backtesting; Key implementation aspects; backtrader - a flexible tool for local backtests; Key concepts of backtrader's Cerebro architecture; How to use backtrader in practice; backtrader summary and next steps; Zipline - scalable backtesting by Quantopian; Calendars and the Pipeline for robust simulations; Ingesting your own bundles with minute data; The Pipeline API - backtesting an ML signal; How to train a model during the backtest; Instead of How to use; Summary
Chapter 9: Time-Series Models for Volatility Forecasts and Statistical Arbitrage
Tools for diagnostics and feature extraction; How to decompose time-series patterns; Rolling window statistics and moving averages; How to measure autocorrelation; How to diagnose and achieve stationarity; Transforming a time series to achieve stationarity; Handling instead of How to handle; Time-series transformations in practice; Univariate time-series models; How to build autoregressive models; How to build moving-average models; How to build ARIMA models and extensions; How to forecast macro fundamentals; How to use time-series models to forecast volatility; Multivariate time-series models; Systems of equations; The vector autoregressive (VAR) model; Using the VAR model for macro forecasts; Cointegration - time series with a shared trend; The Engle-Granger two-step method; The Johansen likelihood-ratio test; Statistical arbitrage with cointegration; How to select and trade comoving asset pairs; Pairs trading in practice; Preparing the strategy backtest; Backtesting the strategy using backtrader; Extensions - how to do better; Summary
Chapter 10: Bayesian ML - Dynamic Sharpe Ratios and Pairs Trading
How Bayesian machine learning works; How to update assumptions from empirical evidence; Exact inference - maximum a posteriori estimation; Deterministic and stochastic approximate inference; Probabilistic programming with PyMC3; Bayesian machine learning with Theano; The PyMC3 workflow: predicting a recession; Bayesian ML for trading; Bayesian Sharpe ratio for performance comparison; Bayesian rolling regression for pairs trading; Stochastic volatility models; Summary
Chapter 11: Random Forests - A Long-Short Strategy for Japanese Stocks
Decision trees - learning rules from data; How trees learn and apply decision rules; Decision trees in practice; Overfitting and regularization; Hyperparameter tuning; Random forests - making trees more reliable; Why ensemble models perform better; Bootstrap aggregation; How to build a random forest; How to train and tune a random forest; Feature importance for random forests; Out-of-bag testing; Pros and cons of random forests; Long-short signals for Japanese stocks; The data - Japanese equities; The ML4T workflow with LightGBM; The strategy - backtest with Zipline; Summary
Chapter 12: Boosting Your Trading Strategy
Getting started - adaptive boosting; The AdaBoost algorithm; Using AdaBoost to predict monthly price moves; Gradient boosting - ensembles for most tasks; How to train and tune GBM models; How to use gradient boosting with sklearn; Using XGBoost, LightGBM, and CatBoost; How algorithmic innovations boost performance; A long-short trading strategy with boosting; Generating signals with LightGBM and CatBoost; Inside the black box - interpreting GBM results; Backtesting a strategy based on a boosting ensemble; Lessons learned and next steps; Boosting for an intraday strategy; Engineering features for high-frequency data; Minute-frequency signals with LightGBM; Evaluating the trading signal quality; Summary
Chapter 13: Data-Driven Risk Factors and Asset Allocation with Unsupervised Learning
Dimensionality reduction; The curse of dimensionality; Linear dimensionality reduction; Manifold learning - nonlinear dimensionality reduction; PCA for trading; Data-driven risk factors; Eigenportfolios; Clustering; k-means clustering; Hierarchical clustering; Density-based clustering; Gaussian mixture models; Hierarchical clustering for optimal portfolios; How hierarchical risk parity works; Backtesting HRP using an ML trading strategy; Summary
Chapter 14: Text Data for Trading - Sentiment Analysis
ML with text data - from language to features; Key challenges of working with text data; The NLP workflow; Applications; From text to tokens - the NLP pipeline; NLP pipeline with spaCy and textacy; NLP with TextBlob; Counting tokens - the document-term matrix; The bag-of-words model; Document-term matrix with scikit-learn; Key lessons instead of lessons learned; NLP for trading; The naive Bayes classifier; Classifying news articles; Sentiment analysis with Twitter and Yelp data; Summary
Chapter 15: Topic Modeling - Summarizing Financial News
Learning latent topics - Goals and approaches; Latent semantic indexing; How to implement LSI using sklearn; Strengths and limitations; Probabilistic latent semantic analysis; How to implement pLSA using sklearn; Strengths and limitations; Latent Dirichlet allocation; How LDA works; How to evaluate LDA topics; How to implement LDA using sklearn; How to visualize LDA results using pyLDAvis; How to implement LDA using Gensim; Modeling topics discussed in earnings calls; Data preprocessing; Model training and evaluation; Running experiments; Topic modeling for financial news; Summary
Chapter 16: Word Embeddings for Earnings Calls and SEC Filings
How word embeddings encode semantics; How neural language models learn usage in context; word2vec - scalable word and phrase embeddings; Evaluating embeddings using semantic arithmetic; How to use pretrained word vectors; GloVe - Global vectors for word representation; Custom embeddings for financial news; Preprocessing - sentence detection and n-grams; The skip-gram architecture in TensorFlow 2; Visualizing embeddings using TensorBoard; How to train embeddings faster with Gensim; word2vec for trading with SEC filings; Preprocessing - sentence detection and n-grams; Model training; Sentiment analysis using doc2vec embeddings; Creating doc2vec input from Yelp sentiment data; Training a doc2vec model; Training a classifier with document vectors; Lessons learned and next steps; New frontiers - pretrained transformer models; Attention is all you need; BERT - towards a more universal language model; Trading on text data - lessons learned and next steps; Summary
Chapter 17: Deep Learning for Trading
Deep learning - what's new and why it matters; Hierarchical features tame high-dimensional data; DL as representation learning; How DL relates to ML and AI; Designing an NN; A simple feedforward neural network architecture; Key design choices; How to regularize deep NNs; Training faster - optimizations for deep learning; Summary - how to tune key hyperparameters; A neural network from scratch in Python; The input layer; The hidden layer; The output layer; Forward propagation; The cross-entropy cost function; How to implement backprop using Python; Popular deep learning libraries; Leveraging GPU acceleration; How to use TensorFlow 2; How to use TensorBoard; How to use PyTorch 1.4; Alternative options; Optimizing an NN for a long-short strategy; Engineering features to predict daily stock returns; Defining an NN architecture framework; Cross-validating design options to tune the NN; Evaluating the predictive performance; Backtesting a strategy based on ensembled signals; How to further improve the results; Summary
Chapter 18: CNNs for Financial Time Series and Satellite Images
How CNNs learn to model grid-like data; From hand-coding to learning filters from data; How the elements of a convolutional layer operate; The evolution of CNN architectures: key innovations; CNNs for satellite images and object detection; LeNet5 - The first CNN with industrial applications; AlexNet - reigniting deep learning research; Transfer learning - faster training with less data; Object detection and segmentation; Object detection in practice; CNNs for time-series data - predicting returns; An autoregressive CNN with 1D convolutions; CNN-TA - clustering time series in 2D format; Summary
Chapter 19: RNNs for Multivariate Time Series and Sentiment Analysis
How recurrent neural nets work; Unfolding a computational graph with cycles; Backpropagation through time; Alternative RNN architectures; How to design deep RNNs; The challenge of learning long-range dependencies; Gated recurrent units; RNNs for time series with TensorFlow 2; Univariate regression - predicting the S&P 500; How to get time series data into shape for an RNN; Stacked LSTM - predicting price moves and returns; Multivariate time-series regression for macro data; RNNs for text data; LSTM with embeddings for sentiment classification; Sentiment analysis with pretrained word vectors; Predicting returns from SEC filing embeddings; Summary
Chapter 20: Autoencoders for Conditional Risk Factors and Asset Pricing
Autoencoders for nonlinear feature extraction; Generalizing linear dimensionality reduction; Convolutional autoencoders for image compression; Managing overfitting with regularized autoencoders; Fixing corrupted data with denoising autoencoders; Seq2seq autoencoders for time series features; Generative modeling with variational autoencoders; Implementing autoencoders with TensorFlow 2; How to prepare the data; One-layer feedforward autoencoder; Feedforward autoencoder with sparsity constraints; Deep feedforward autoencoder; Convolutional autoencoders; Denoising autoencoders; A conditional autoencoder for trading; Sourcing stock prices and metadata information; Computing predictive asset characteristics; Creating the conditional autoencoder architecture; Lessons learned and next steps; Summary
Chapter 21: Generative Adversarial Networks for Synthetic Time-Series Data
Creating synthetic data with GANs; Comparing generative and discriminative models; Adversarial training - a zero-sum game of trickery; The rapid evolution of the GAN architecture zoo; GAN applications to images and time-series data; How to build a GAN using TensorFlow 2; Building the generator network; Creating the discriminator network; Setting up the adversarial training process; Evaluating the results; TimeGAN for synthetic financial data; Learning to generate data across features and time; Implementing TimeGAN using TensorFlow 2; Evaluating the quality of synthetic time-series data; Lessons learned and next steps; Summary
Chapter 22: Deep Reinforcement Learning - Building a Trading Agent
Elements of a reinforcement learning system; The policy - translating states into actions; Rewards - learning from actions; The value function - optimal choice for the long run; With or without a model - look before you leap?; How to solve reinforcement learning problems; Key challenges in solving RL problems; Fundamental approaches to solving RL problems; Solving dynamic programming problems; Finite Markov decision problems; Policy iteration; Value iteration; Generalized policy iteration; Dynamic programming in Python; Q-learning - finding an optimal policy on the go; Exploration versus exploitation - ε-greedy policy; The Q-learning algorithm; How to train a Q-learning agent using Python; Deep RL for trading with the OpenAI Gym; Value function approximation with neural networks; The Deep Q-learning algorithm and extensions; Introducing the OpenAI Gym; How to implement DDQN using TensorFlow 2; Creating a simple trading agent; How to design a custom OpenAI trading environment; Deep Q-learning on the stock market; Lessons learned; Summary
Chapter 23: Conclusions and Next Steps
Key takeaways and lessons learned; Data is the single most important ingredient; Domain expertise - telling the signal from the noise; ML is a toolkit for solving problems with data; Beware of backtest overfitting; How to gain insights from black-box models; ML for trading in practice; Data management technologies; ML tools; Online trading platforms; Conclusion
Appendix: Alpha Factor Library
Common alpha factors implemented in TA-Lib; A key building block - moving averages; Overlap studies - price and volatility trends; Momentum indicators; Volume and liquidity indicators; Volatility indicators; Fundamental risk factors; WorldQuant's quest for formulaic alphas; Cross-sectional and time-series functions; Formulaic alpha expressions; Bivariate and multivariate factor evaluation; Information coefficient and mutual information; Feature importance and SHAP values; Comparison - the top 25 features for each metric; Financial performance - Alphalens
References
Index |
NN [viii] 471 471 471 473 475 475 476 478 478 479 480 481 482 483 484 485 485 487 489 489 491 492 493 496 497 499 500 501 503 503 504 505 507 507 508 509 511 511 513 514 515 516 517 518
Table of Contents A simple feedforward neural network architecture Key design choices How to regularize deep NNs Training faster - optimizations for deep learning Summary - how to tune key hyperparameters A neural network from scratch in Python The input layer The hidden layer The output layer Forward propagation The cross-entropy cost function How to implement backpropusing Python Popular deep learning libraries Leveraging GPU acceleration How to use TensorFlow 2 How to use TensorBoard How to use PyTorch 1.4 Alternative options Optimizing an NN for a long-short strategy Engineering features to predict daily stock returns Defining an NN architecture framework Cross-validating design options to tune the NN Evaluating the predictive performance Backtesting a strategy based on ensembled signals How to further improve the results Summary Chapter 18: CNNs for Financial Time Series and Satellite Images How CNNs learn to model grid-like data From hand-coding to learning filters from data How the elements of a convolutional layer operate The evolution of CNN architectures: key innovations CNNs for satellite images and object detection LeNet5 - The first CNN with industrial applications AlexNet - reigniting deep learning research Transfer learning - faster training with less data Object detection and segmentation Object detection in practice CNNs for time-series data - predicting returns An autoregressive CNN with 1D convolutions CNN-TA - clustering time series in 2D format Summary Chapter 19: RNNs for Multivariate Time Series and Sentiment Analysis How recurrent neural nets work
519 520 522 523 525 526 526 527 528 529 529 529 534 534 535 537 538 541 542 542 542 543 545 547 549 549 551 552 553 554 558 559 560 563 565 573 573 577 577 581 589 591 592 [lx]
Table of Contents Unfolding a computational graph with cycles Backpropagation through time Alternative RNN architectures How to design deep RNNs The challenge of learning long-range dependencies Gated recurrent units RNNs for time series with TensorFlow 2 Univariate regression - predicting the S P 500 How to get time series data into shape for an RNN Stacked LSTM - predicting price moves and returns Multivariate time-series regression for macro data RNNs for text data LSTM with embeddings for sentiment classification Sentiment analysis with pretrained word vectors Predicting returns from SEC filing embeddings Summary Chapter 20: Autoencoders for Conditional Risk Factors and Asset Pricing Autoencoders for nonlinear feature extraction Generalizing linear dimensionality reduction Convolutional autoencoders for image compression Managing overfitting with regularized autoencoders Fixing corrupted data with denoising autoencoders Seq2seq autoencoders for time series features Generative modeling with variational autoencoders Implementing autoencoders with TensorFlow 2 How to prepare the data One-layer feedforward autoencoder Feedforward autoencoder with sparsity constraints Deep feedforward autoencoder Convolutional autoencoders Denoising autoencoders A conditional autoencoder for trading Sourcing stock prices and metadata information Computing predictive asset characteristics Creating the conditional autoencoder architecture Lessons learned and next steps Summary Chapter 21: Generative Adversarial Networks for Synthetic Time-Series Data Creating synthetic data with GANs Comparing
generative and discriminative models Adversarial training - a zero-sum game of trickery The rapid evolution of the GAN architecture zoo [x] 594 594 595 596 597 599 599 600 600 605 611 614 614 617 619 624 625 626 626 627 628 628 629 629 630 630 631 634 634 636 637 638 639 641 643 648 648 649 650 651 651 652
Table of Contents GAN applications to images and time-series data How to build a GAN using TensorFlow 2 Building the generator network Creating the discriminator network Setting up the adversarial training process Evaluating the results TimeGAN for synthetic financial data Learning to generate data across features and time Implementing TimeGAN using TensorFlow 2 Evaluating the quality of synthetic time-series data Lessons learned and next steps Summary Chapter 22: Deep Reinforcement Learning Building a Trading Agent Elements of a reinforcement learning system The policy - translating states into actions Rewards - learning from actions The value function - optimal choice for the long run With or without a model - look before you leap? How to solve reinforcement learning problems Key challenges in solving RL problems Fundamental approaches to solving RL problems Solving dynamic programming problems Finite Markov decision problems Policy iteration Value iteration Generalized policy iteration Dynamic programming in Python Q-learning - finding an optimal policy on the go Exploration versus exploitation - E-greedy policy The Q-learning algorithm How to train a Q-learning agent using Python Deep RL for trading with the OpenAI Gym Value function approximation with neural networks The Deep Q-learning algorithm and extensions Introducing the OpenAI Gym How to implement DDQN using TensorFlow 2 Creating a simple trading agent How to design a custom OpenAI trading environment Deep Q-learning on the stock market Lessons learned Summary Chapter 23: Conclusions and Next Steps Key takeaways
and lessons learned 653 655 655 656 657 660 660 661 663 672 678 678 679 680 681 681 682 682 682 683 683 684 684 687 688 688 689 694 695 695 695 696 697 697 699 700 704 705 709 711 711 713 714 ----------------------------------------------------------------------------- [ xi ] ------------------------------------------------------------------------
Table of Contents Data is the single most important ingredient Domain expertise ֊ telling the signal from the noise ML is a toolkit for solving problems with data Beware of backtest overfitting How to gain insights from black-box models ML for trading in practice Data management technologies ML tools Online trading platforms Conclusion Appendix: Alpha Factor Library Common alpha factors implemented in TA-Lib A key building block - moving averages Overlap studies - price and volatility trends Momentum indicators Volume and liquidity indicators Volatility indicators Fundamental risk factors WorldQuant's quest for formulaic alphas Cross-sectional and time-series functions Formulaic alpha expressions Bivariate and multivariate factor evaluation Information coefficient and mutual information Feature importance and SHAP values Comparison - the top 25 features for each metric Financial performance-Alphalens References 715 716 717 719 719 720 720 722 722 723 725 726 726 729 733 741 743 744 745 745 747 749 749 750 750 752 753 Index 769 [xü] |
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | Jansen, Stefan |
author_GND | (DE-588)143872338 |
author_facet | Jansen, Stefan |
author_role | aut |
author_sort | Jansen, Stefan |
author_variant | s j sj |
building | Verbundindex |
bvnumber | BV046950335 |
classification_rvk | QH 500 QK 620 ST 300 |
ctrlnum | (OCoLC)1220895167 (DE-599)HBZHT020578539 |
discipline | Informatik Wirtschaftswissenschaften |
discipline_str_mv | Informatik Wirtschaftswissenschaften |
edition | Second edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01910nam a2200433 c 4500</leader><controlfield tag="001">BV046950335</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20230124 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">201020s2020 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781839217715</subfield><subfield code="9">978-1-83921-771-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1220895167</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HBZHT020578539</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-573</subfield><subfield code="a">DE-1043</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-19</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">QH 500</subfield><subfield code="0">(DE-625)141607:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">QK 620</subfield><subfield code="0">(DE-625)141668:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 300</subfield><subfield code="0">(DE-625)143650:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Jansen, Stefan</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)143872338</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Machine learning for algorithmic 
trading</subfield><subfield code="b">predictive models to extract signals from market and alternative data for systematic trading strategies with Python</subfield><subfield code="c">Stefan Jansen</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Second edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham ; Mumbai</subfield><subfield code="b">Packt</subfield><subfield code="c">July 2020</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2020</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xii, 790 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Wertpapierhandelssystem</subfield><subfield code="0">(DE-588)4510686-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Algorithmus</subfield><subfield code="0">(DE-588)4001183-5</subfield><subfield code="2">gnd</subfield><subfield 
code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Wertpapierhandelssystem</subfield><subfield code="0">(DE-588)4510686-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Algorithmus</subfield><subfield code="0">(DE-588)4001183-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032358867&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032358867</subfield></datafield></record></collection> |
id | DE-604.BV046950335 |
illustrated | Illustrated |
index_date | 2024-07-03T15:40:54Z |
indexdate | 2024-07-10T08:58:23Z |
institution | BVB |
isbn | 9781839217715 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032358867 |
oclc_num | 1220895167 |
open_access_boolean | |
owner | DE-573 DE-1043 DE-739 DE-19 DE-BY-UBM |
owner_facet | DE-573 DE-1043 DE-739 DE-19 DE-BY-UBM |
physical | xii, 790 Seiten Illustrationen, Diagramme |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | Packt |
record_format | marc |
spelling | Jansen, Stefan Verfasser (DE-588)143872338 aut Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python Stefan Jansen Second edition Birmingham ; Mumbai Packt July 2020 © 2020 xii, 790 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Wertpapierhandelssystem (DE-588)4510686-1 gnd rswk-swf Python Programmiersprache (DE-588)4434275-5 gnd rswk-swf Algorithmus (DE-588)4001183-5 gnd rswk-swf Wertpapierhandelssystem (DE-588)4510686-1 s Algorithmus (DE-588)4001183-5 s Maschinelles Lernen (DE-588)4193754-5 s Python Programmiersprache (DE-588)4434275-5 s DE-604 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032358867&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Jansen, Stefan Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python Maschinelles Lernen (DE-588)4193754-5 gnd Wertpapierhandelssystem (DE-588)4510686-1 gnd Python Programmiersprache (DE-588)4434275-5 gnd Algorithmus (DE-588)4001183-5 gnd |
subject_GND | (DE-588)4193754-5 (DE-588)4510686-1 (DE-588)4434275-5 (DE-588)4001183-5 |
title | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python |
title_auth | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python |
title_exact_search | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python |
title_exact_search_txtP | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python |
title_full | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python Stefan Jansen |
title_fullStr | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python Stefan Jansen |
title_full_unstemmed | Machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with Python Stefan Jansen |
title_short | Machine learning for algorithmic trading |
title_sort | machine learning for algorithmic trading predictive models to extract signals from market and alternative data for systematic trading strategies with python |
title_sub | predictive models to extract signals from market and alternative data for systematic trading strategies with Python |
topic | Maschinelles Lernen (DE-588)4193754-5 gnd Wertpapierhandelssystem (DE-588)4510686-1 gnd Python Programmiersprache (DE-588)4434275-5 gnd Algorithmus (DE-588)4001183-5 gnd |
topic_facet | Maschinelles Lernen Wertpapierhandelssystem Python Programmiersprache Algorithmus |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032358867&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT jansenstefan machinelearningforalgorithmictradingpredictivemodelstoextractsignalsfrommarketandalternativedataforsystematictradingstrategieswithpython |