Time Series Analysis with Pandas
Python’s pandas library is frequently used to import, manage, and analyze datasets in a variety of formats. In this article, we’ll use it to analyze Amazon’s stock prices and perform some basic time series operations.
Table of Contents:
- Introduction
- Time series data
- Importing stock data and necessary Python libraries
- Pandas for time series analysis
- Time shifting
- Rolling windows
- Conclusion
Introduction
Stock markets play an important role in a country’s economy. Governments, private sector companies, and central banks keep a close eye on market fluctuations because they have much to gain or lose from them. The stock market is volatile, which makes analyzing stock prices tricky, and this is where Python comes in. With its built-in tools and external libraries such as pandas, Python makes it much easier to load, clean, and analyze complex stock market data.
Prerequisites
We’ll be analyzing stock data with Python 3, pandas and Matplotlib. To fully benefit from this article, you should be familiar with the basics of pandas as well as the plotting library called Matplotlib.
Time series data
Time series data is a sequence of data points in chronological order that is used by businesses to analyze past data and make future predictions. These data points are a set of observations at specified times and equal intervals, typically with a datetime index and corresponding value. Common examples of time series data in our day-to-day lives include:
- Measuring weather temperatures
- Measuring the number of taxi rides per month
- Tracking a company’s daily stock prices, for example to predict the next day’s price
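To make the idea of a datetime index with corresponding values concrete, here is a minimal sketch of a small time series in pandas. The temperature values and the name `temp_c` are made up for illustration and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical daily temperatures; the values are invented for illustration.
dates = pd.date_range(start="2018-01-01", periods=5, freq="D")
temperatures = pd.Series(
    [21.5, 22.1, 19.8, 20.4, 23.0],
    index=dates,
    name="temp_c",
)

# Each observation is a (timestamp, value) pair at equal one-day intervals.
print(temperatures)
print(type(temperatures.index))
```

Because the index is a `DatetimeIndex`, individual observations can be looked up by timestamp, for example `temperatures[pd.Timestamp("2018-01-03")]`.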
Variations of time series data
- Trend Variation: moves up or down in a reasonably predictable pattern over a long period of time.
- Seasonality Variation: regular and periodic; repeats itself over a specific period, such as a day, week, month, season, etc.
- Cyclical Variation: corresponds with business or economic ‘boom-bust’ cycles, or is cyclical in some other form.
- Random Variation: erratic or residual; doesn’t fall under any of the above three classifications.
Here are the four variations of time series data visualized:
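The four variations can also be sketched numerically. The example below builds one synthetic series per variation and adds them together. All amplitudes, periods, and the random seed are arbitrary assumptions chosen for illustration, not properties of real stock data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable
index = pd.date_range("2013-01-01", periods=72, freq="MS")  # 72 month starts
t = np.arange(len(index))

trend = 0.5 * t                              # steady long-run rise
seasonal = 10 * np.sin(2 * np.pi * t / 12)   # repeats every 12 months
cyclical = 15 * np.sin(2 * np.pi * t / 48)   # slower, multi-year swing
noise = rng.normal(loc=0, scale=3, size=len(t))  # erratic, residual part

components = pd.DataFrame(
    {"trend": trend, "seasonal": seasonal, "cyclical": cyclical, "random": noise},
    index=index,
)

# An observed series can be thought of as the sum of its components.
components["combined"] = components.sum(axis=1)
print(components.head())
```

With Matplotlib installed, `components.plot(subplots=True)` would draw each column in its own panel.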
Importing stock data and necessary Python libraries
To demonstrate the use of pandas for stock analysis, we will be using Amazon stock prices from 2013 to 2018. We’re pulling the data from Quandl, a company offering a Python API for sourcing a la carte market data. A CSV file of the data in this article can be downloaded from the article’s repository.
Fire up the editor of your choice and type in the following code to import the libraries and data that correspond to this article.
Example code for this article may be found at the Kite Blog repository on GitHub.
# Importing required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Use a built-in Matplotlib style for nicer-looking plots
plt.style.use('fivethirtyeight')
# Reading in the data
data = pd.read_csv('amazon_stock.csv')
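For time series work it is often convenient to have pandas parse the dates while reading the file. The sketch below shows one way to do that with the `parse_dates` and `index_col` parameters of `pd.read_csv`. The in-memory CSV, its column names (`Date`, `Open`, `Close`), and its values are assumptions standing in for the real file, whose layout may differ.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file; column names and values are made up.
csv_text = """Date,Open,Close
2013-01-02,256.08,257.31
2013-01-03,257.27,258.48
2013-01-04,257.58,259.15
"""

# parse_dates converts the 'Date' column to datetimes,
# and index_col makes it the index of the DataFrame.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

print(data.index.dtype)
print(data.head())
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path, and the column name passed to `parse_dates` would need to match the date column in that file.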
A first look at Amazon’s stock prices
Let’s look at the first few rows of the dataset:
# Inspecting the data
data.head()
Let’s get rid of the first two columns as they don’t add any value to the dataset.
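One way to drop the first two columns is by position, which avoids having to know their names. This is a minimal sketch on a hypothetical DataFrame. The column names `extra_1` and `extra_2` and all values are placeholders, not the actual columns of the Amazon dataset.

```python
import pandas as pd

# Hypothetical frame; the first two columns stand in for the ones to drop.
data = pd.DataFrame({
    "extra_1": [0, 1],
    "extra_2": ["AMZN", "AMZN"],
    "Date": ["2013-01-02", "2013-01-03"],
    "Close": [257.31, 258.48],
})

# Select the first two column labels by position and drop them.
data = data.drop(columns=data.columns[:2])

print(data.columns.tolist())  # ['Date', 'Close']
```

If the column names are known, passing them explicitly, as in `data.drop(columns=[...])`, makes the intent clearer to readers of the code.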