Time series prediction with Machine Learning

au
6 min read · Jan 2, 2022

Before moving to the problem statement, let us understand the fundamental terms related to working with time-series data.

Time Series Data: Any data recorded over time, where each observation is tied to a timestamp, is termed time-series data. For example, the number of orders placed on a food ordering app per day is time-series data.

Time Series Analysis: Analyzing time-series data to find useful insights and patterns is termed time series analysis. Let’s take the food ordering app example again. The app might log its order data per hour for every day. The team might notice that the number of orders is significantly higher in, say, the 1–2 PM time slot but significantly lower in the 3–4 PM time slot. This information is useful because it lets them estimate the number of delivery staff required at a particular time of day. Hence, time series analysis is an essential first step when working with any time series data.
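To make the hourly pattern idea concrete, here is a minimal sketch in plain Python. The order counts are made up for illustration, and the function name `average_orders_by_hour` is my own, not part of any library. It simply groups the logged counts by hour of day and averages them.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def average_orders_by_hour(records):
    """Return {hour_of_day: average order count} from (timestamp, orders) pairs."""
    by_hour = defaultdict(list)
    for timestamp, orders in records:
        by_hour[timestamp.hour].append(orders)
    return {hour: mean(counts) for hour, counts in sorted(by_hour.items())}


# Made-up hourly order counts for two days (illustrative only).
sample_records = [
    (datetime(2022, 1, 1, 13), 120),  # 1-2 PM slot, day 1
    (datetime(2022, 1, 1, 15), 30),   # 3-4 PM slot, day 1
    (datetime(2022, 1, 2, 13), 140),  # 1-2 PM slot, day 2
    (datetime(2022, 1, 2, 15), 50),   # 3-4 PM slot, day 2
]

hourly_profile = average_orders_by_hour(sample_records)
busiest_hour = max(hourly_profile, key=hourly_profile.get)
quietest_hour = min(hourly_profile, key=hourly_profile.get)
print(hourly_profile, busiest_hour, quietest_hour)
```

With real data you would have many more days and all 24 hours, but the idea stays the same: the busiest and quietest hours tell you where staffing needs differ.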

Time Series Forecasting: Time series forecasting uses past data to make predictions about the future. Say the food ordering app wants to predict the number of orders per day for the next month in order to plan its resources better. To do this, the team would look at a large amount of past data and use it to build the forecast.
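As a rough illustration of "using the past to predict the future", the sketch below implements a seasonal naive forecast: it assumes next week will look like last week. This is only a simple baseline that I am using for illustration, not the method this article recommends. The daily order counts and the function name `seasonal_naive_forecast` are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast by repeating the most recent full season of observations."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Step i of the forecast reuses the value from the same position in the last season.
    return [last_season[i % season_length] for i in range(horizon)]


# Two weeks of made-up daily order counts (Monday to Sunday, twice).
daily_orders = [100, 110, 105, 120, 150, 200, 180,
                102, 108, 107, 125, 155, 210, 185]

forecast = seasonal_naive_forecast(daily_orders, season_length=7, horizon=10)
print(forecast)
```

A baseline like this is useful mainly as a yardstick: any more sophisticated model should be able to beat it.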

There are 2 types of forecasting:
1. Quantitative (data driven)
- Data is available
- Historic patterns are expected to repeat
- Less prone to human bias
- Can capture complex patterns
2. Qualitative (expert driven)
- Data is NOT available
- Historic patterns may NOT repeat
- Bias is involved, as it relies on expert opinion
- May NOT capture complex patterns

In short, time series prediction or forecasting means predicting future values based on historic data or on experts' analysis over a period of time.

What are the use cases of time series forecasting in real world applications?

  1. Financial and business domain: Data accumulated over time can be used to predict a company's future growth, demand, pricing, risk and much more. Companies can use these forecasts to make important decisions about purchasing, production, sustainability in the market, strategic growth, increasing profits, stock market activity, hiring new talent and so on.

  2. Medical domain: With the arrival of wearable technology, time series prediction in the medical domain has received a boost. Smartwatches available in the market continuously collect important data points on sleep and exercise routines, heart rate over time, blood oxygen levels and much more. Once these data points have been collected over a period of time, it is much easier to identify patterns. Some smart devices can learn these routines and flag a problem before a serious event happens. One such example is Deanna Recktenwald from the USA, whose Apple Watch alerted her to an abnormally high heart rate, which led doctors to detect a serious underlying medical condition in time. This is just one example of what these smart devices are capable of. The same concept was also used during the pandemic to predict infection rates, the demand for hospital care and so on.

  3. Weather: Thanks to weather forecasting, we know in advance whether we need to carry an umbrella when stepping out of the comfort of our homes. Weather stations all over the globe record weather conditions, are well connected to each other and accumulate vast amounts of data.

Now that we know the basics, how do we start? How do we go about time series forecasting?

  1. Identify the problem that needs to be solved
  2. Collect data over a period of time
  3. Explore and analyze the data
  4. Build and evaluate the model, then use it to forecast
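The "build and evaluate" step can be sketched as follows, under some assumptions of mine: the data is a plain list of made-up weekly revenue figures, the last few points are held out as a test set, and two very simple baseline forecasts are compared using mean absolute error (MAE). All function names here are hypothetical helpers written for this example.

```python
from statistics import mean


def split_train_test(series, test_size):
    """Hold out the last `test_size` points; never shuffle time series data."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def last_value_forecast(train, horizon):
    """Baseline 1: repeat the most recent observation."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Baseline 2: repeat the historical average."""
    return [mean(train)] * horizon


# Made-up weekly revenue figures with an upward trend (illustrative only).
weekly_revenue = [10, 12, 11, 13, 14, 15, 16, 17]
train, test = split_train_test(weekly_revenue, test_size=2)

last_value_mae = mean_absolute_error(test, last_value_forecast(train, len(test)))
mean_mae = mean_absolute_error(test, mean_forecast(train, len(test)))
print(last_value_mae, mean_mae)
```

On this trending toy series the last-value baseline has the lower error, because the historical average lags behind the trend. The key design choice is that the split respects time order: the model only ever sees data from before the period it is evaluated on.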

Next, define what you want to forecast:

  1. Quantity (for example: profit, revenue, price)
  2. Granularity (for example: the level of the forecast, such as store, state, region, country or world)
  3. Frequency (for example: daily, weekly, monthly, yearly)
  4. Horizon (for example: short term for daily operational purposes; medium term, a few months ahead, for target setting for hiring and purchases; long term for yearly planning on company growth, such as entering a new market or location and other strategic decisions)
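One way to keep these four choices explicit is to write them down as a small data structure before any modeling starts. The `ForecastSpec` class below is a hypothetical helper of my own, shown only as a sketch of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastSpec:
    """The four choices that define a forecasting problem."""
    quantity: str         # what is being forecast, e.g. "revenue"
    granularity: str      # level of aggregation, e.g. "store"
    frequency: str        # spacing of forecast points, e.g. "weekly"
    horizon_periods: int  # how many periods ahead to forecast

    def __post_init__(self):
        if self.horizon_periods <= 0:
            raise ValueError("horizon_periods must be positive")

    def describe(self):
        return (f"Forecast {self.quantity} per {self.granularity}, "
                f"{self.frequency}, for the next {self.horizon_periods} periods")


spec = ForecastSpec(quantity="revenue", granularity="store",
                    frequency="weekly", horizon_periods=12)
print(spec.describe())
```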

One thing to keep in mind before moving forward is that there are some caveats associated with time series forecasting. These caveats revolve around the choices you just made while defining the problem.

  1. The Granularity Rule: The more aggregated your forecasts, the more accurate your predictions tend to be, simply because aggregated data has less relative variance and hence less noise. As a thought experiment, suppose you work at ABC, an online entertainment streaming service, and you want to predict the number of views for a newly launched TV show in Mumbai for the next year. Would your predictions be more accurate at the city level or at the area level? Accurately predicting the views from each area is difficult, but when you sum up the predicted views for all areas and present your final prediction at the city level, it might be surprisingly accurate. This is because for some areas you will have predicted fewer views than the actual number, whereas for others you will have predicted more. When you sum all of these up, the errors largely cancel each other out, leaving you with a good prediction. Hence, you should avoid making predictions at very granular levels.
  2. The Frequency Rule: This rule tells you to keep updating your forecasts regularly to capture any new information that comes in. Let’s continue with the ABC example, where the problem is to predict the number of views for a newly launched TV show in Mumbai for the next year. If you update your forecasts too infrequently, you might not capture new information accurately. For example, say you update your forecasts every 3 months. Due to the COVID-19 pandemic, residents may be locked in their homes for around 2–3 months, during which the number of views will increase significantly. If you only update your forecast every 3 months, you will miss this increase in views, which may lead to significant losses and mismanagement.
  3. The Horizon Rule: When your horizon extends many months into the future, you are more likely to be accurate in the earlier months than in the later ones. Let’s go back to the ABC example once more. Suppose that in December 2019 the streaming service made a prediction for the number of views over the next 6 months. It may have been quite accurate for the first two months, but due to the unforeseen COVID-19 situation, the actual number of views in the following months would have been significantly higher than predicted because everyone was staying at home. The farther ahead we go into the future, the more uncertain we are about the forecasts.
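The granularity rule can be checked with a small simulation. The sketch below rests on assumptions of mine that are not from any real dataset: every area has the same true number of views, and each area-level prediction is off by independent, unbiased random noise. The function name `simulate_granularity` and all the numbers are hypothetical.

```python
import random


def simulate_granularity(num_areas=50, true_views=1000.0, noise_sd=200.0, seed=0):
    """Compare relative forecast error at area level and at city level."""
    rng = random.Random(seed)
    actual = [true_views] * num_areas
    # Assumption: each area's prediction misses by independent, unbiased noise.
    predicted = [views + rng.gauss(0, noise_sd) for views in actual]

    # Area level: average relative error across the individual areas.
    area_error = sum(abs(p - a) / a for p, a in zip(predicted, actual)) / num_areas
    # City level: relative error of the summed prediction against the summed actual.
    city_error = abs(sum(predicted) - sum(actual)) / sum(actual)
    return area_error, city_error


area_error, city_error = simulate_granularity()
print(f"area-level error: {area_error:.1%}, city-level error: {city_error:.1%}")
```

Under these assumptions the city-level error comes out smaller, because over-predictions in some areas offset under-predictions in others. Note that this only works when the errors are not all biased in the same direction: if every area is over-predicted, summing them up will not cancel anything.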

To summarize, in this article I have tried to give you a basic understanding of time series modeling and its real-world applications. In future articles I will delve further into the advanced concepts around time series modeling, and we will get our hands dirty by solving real-world problems. Until then, happy learning :)

Written by au

Healthcare IT specialist | Problem solver | Technologist | Knowledge of Python, ML and AI | MS (Liverpool John Moores University)
