Our aim is to conduct sentiment analysis on Apple Inc. (AAPL) stock using news headlines, which reflect media coverage, industry reports, and investor opinion. By scoring the sentiment of these headlines, the project looks for patterns that may correlate with stock price movements. The insights gained could show how public perception relates to Apple’s stock performance, offering potential benefits for investors and analysts.
Introduction
In an era dominated by data, the dynamics of financial markets have become increasingly intricate. Investors and analysts are continuously seeking innovative approaches to gain insights into stock price movements. This project embarks on a comprehensive exploration of sentiment analysis as a tool to unravel the nuanced relationship between public sentiment, news events, and the stock performance of Apple Inc.
Question
How can sentiment analysis of news headlines contribute to understanding the stock price movements of Apple Inc., and what insights can be gained regarding the interconnectedness of external factors and their impact on Apple’s stock performance in the broader market context?
# Import essential libraries for data manipulation, visualization, sentiment analysis,
# and financial data retrieval using Yahoo Finance.
import warnings
warnings.filterwarnings("ignore")

from dateutil import parser
import requests
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from datetime import datetime, timedelta
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import seaborn as sns
from wordcloud import WordCloud
Code
API_KEY = "9dcab5d0d86940459623ec7dea5c8d36"
Code
# Create a Ticker object for the stock with symbol 'AAPL' (Apple Inc.)
apple = yf.Ticker("AAPL")

# Retrieve historical stock data for the last month for the specified ticker
apple_data = apple.history(period='1mo')

# Reset the date from index to column and convert it to '%Y-%m-%d' format
apple_data.reset_index(inplace=True)
apple_data['Date'] = apple_data['Date'].dt.strftime('%Y-%m-%d')
Code
# Set up the parameters for the API request
api_key = API_KEY
stock_symbol = "AAPL"

# Set up the date range for fetching news data (last 30 days)
end_date = datetime.now()
start_date = end_date - timedelta(days=29)

# Format dates for the API request
from_date = start_date.strftime("%Y-%m-%d")
to_date = end_date.strftime("%Y-%m-%d")

# Construct the final URL for fetching news within the date range
news_url = f"https://newsapi.org/v2/everything?q={stock_symbol}&apiKey={api_key}&from={from_date}&to={to_date}&language=en"

# Make a request to the API with the specified date range
response = requests.get(news_url)

# Check whether the API request was successful (status code 200)
if response.status_code == 200:
    # Parse the JSON response to retrieve news articles
    news_data = response.json()
    articles = news_data['articles']
    # Extract headlines and publication dates from articles
    headlines = [(article['title'], article['publishedAt']) for article in articles]
else:
    # Display an error message and fall back to an empty list so later cells still run
    print("Failed to retrieve news data.")
    headlines = []
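The request URL in the cell above is assembled with an f-string, which leaves the query term unescaped (a term with a space, for example, would go into the URL as-is). As a minimal sketch, the same NewsAPI `everything` URL can be built with the standard library's `urlencode`. The helper name `build_news_url`, the placeholder key, and the sample dates are illustrative assumptions, not part of the original notebook.

```python
from urllib.parse import urlencode

# Endpoint taken from the notebook's request URL.
NEWS_ENDPOINT = "https://newsapi.org/v2/everything"


def build_news_url(query, api_key, from_date, to_date, language="en"):
    """Return the request URL with every parameter percent-encoded.

    Hypothetical helper for illustration; the parameter names mirror
    the query string used in the notebook.
    """
    params = {
        "q": query,
        "apiKey": api_key,
        "from": from_date,
        "to": to_date,
        "language": language,
    }
    return f"{NEWS_ENDPOINT}?{urlencode(params)}"


# Placeholder key and dates, for illustration only.
url = build_news_url("economic crisis", "YOUR_API_KEY", "2024-01-01", "2024-01-30")
print(url)
```

With `requests`, passing the same dictionary through the `params` argument of `requests.get` achieves the same encoding without building the string by hand.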
Code
# Initialize an empty list to store Apple-related headlines with their respective dates
apple_related_headlines = []

# Iterate through headlines and their associated publication dates
for headline, published_at in headlines:
    try:
        # Attempt to parse the date using a parser (assuming it's in a valid format)
        date = parser.parse(published_at)
        # Check if the headline contains the keyword 'Apple'
        if 'Apple' in headline:
            # If it does, append the headline and its date to the list
            apple_related_headlines.append((headline, date))
    except ValueError:
        # Ignore and continue if there's an issue parsing the date
        pass

# Sort the list of Apple-related headlines by date
apple_related_headlines.sort(key=lambda x: x[1])

# Display the sorted Apple-related headlines along with their dates
# for data in apple_related_headlines:
#     print(f'Headline: {data[0]}\nDate: {data[1]}\n')

# Create a DataFrame from the list of Apple-related headlines with dates
df_apple_related_headlines = pd.DataFrame(apple_related_headlines, columns=['Headlines', 'date'])

# Remove duplicate headlines to retain unique entries
df_unique_apple_related_headlines = df_apple_related_headlines.drop_duplicates(subset=['Headlines'])
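The filter above keeps any headline containing the substring 'Apple', which would also keep a headline about "Applebee's". One possible refinement, sketched here under the assumption that whole-word matches on "Apple" or "AAPL" are what is wanted, uses a regular expression with word boundaries. The names `APPLE_PATTERN` and `mentions_apple` are hypothetical and do not appear in the notebook.

```python
import re

# Match "Apple" or "AAPL" as whole words (case-sensitive, like the original filter).
APPLE_PATTERN = re.compile(r"\b(?:Apple|AAPL)\b")


def mentions_apple(headline):
    """Return True when the headline names Apple as a standalone word."""
    # Guard against a missing or empty title before searching.
    return bool(headline) and APPLE_PATTERN.search(headline) is not None


examples = [
    "Apple unveils new iPhone",
    "AAPL slides after earnings",
    "Applebee's expands its menu",
    "Pineapple prices rise",
]
for text in examples:
    print(text, mentions_apple(text))
```

A possessive such as "Apple's" still matches, because the apostrophe counts as a word boundary.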
Code
# Join all unique headlines into a single string
text = ' '.join(df_unique_apple_related_headlines['Headlines'])

# Generate the word cloud
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)

# Display the word cloud using matplotlib
plt.figure(figsize=(10, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud of Headlines')
plt.show()
Code
# Initialize the SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()

# Initialize lists to store sentiment scores for each headline
sentiments = []       # Stores headlines
neg_scores = []       # Stores negative sentiment scores
neu_scores = []       # Stores neutral sentiment scores
pos_scores = []       # Stores positive sentiment scores
compound_scores = []  # Stores compound sentiment scores

# Iterate through each headline in the DataFrame to analyze sentiment
for sentence in df_unique_apple_related_headlines['Headlines']:
    # Perform sentiment analysis on each headline using the analyzer
    vs = analyzer.polarity_scores(sentence)
    # Append headline, negative, neutral, positive, and compound scores to respective lists
    sentiments.append(sentence)
    neg_scores.append(vs['neg'])
    neu_scores.append(vs['neu'])
    pos_scores.append(vs['pos'])
    compound_scores.append(vs['compound'])
Code
# Create a DataFrame to store sentiment scores of the headlines
sentiment_df = pd.DataFrame({
    'Headlines': sentiments,
    'Negative Score': neg_scores,
    'Neutral Score': neu_scores,
    'Positive Score': pos_scores,
    'Compound Score': compound_scores,
})
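The `Compound Score` column holds VADER's single summary value per headline, ranging from -1 (most negative) to 1 (most positive). The VADER documentation suggests cutoffs of 0.05 and -0.05 for labelling text as positive, negative, or neutral. A minimal sketch of that labelling follows; the function name `label_sentiment` and the sample scores are illustrative assumptions, not part of the notebook or its data.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score to a coarse sentiment label.

    Scores at or above +threshold are positive, at or below -threshold
    are negative, and everything in between is neutral.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up compound scores, for illustration only.
sample_scores = [0.64, 0.0, -0.30, 0.05, -0.04]
labels = [label_sentiment(score) for score in sample_scores]
print(labels)
```

Counting the labels per day is one way to check whether a daily average hides a mix of strongly positive and strongly negative headlines.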
Code
# Merge the unique Apple-related headlines DataFrame and the sentiment scores DataFrame
merged_df = pd.merge(df_unique_apple_related_headlines, sentiment_df, how="inner", on=["Headlines"])

# Convert the 'date' column to the required format
merged_df['date'] = merged_df['date'].dt.strftime('%Y-%m-%d')
Code
# Create traces for the subplots
trace1 = go.Scatter(x=apple_data['Date'], y=apple_data['Close'], mode='lines', name='AAPL Close Price')
trace2 = go.Scatter(x=merged_df['date'], y=merged_df['Compound Score'], mode='lines', name='Sentiment Score', line=dict(color='orange'))

# Create the subplot figure
fig = make_subplots(rows=2, cols=1, subplot_titles=('AAPL Stock Price', 'Sentiment Analysis of News Headlines'))

# Add traces to subplots
fig.add_trace(trace1, row=1, col=1)
fig.add_trace(trace2, row=2, col=1)

# Update layout
fig.update_layout(height=600, width=800, title_text="AAPL Stock Price and Sentiment Analysis")

# Update x-axis and y-axis labels
fig.update_xaxes(title_text="Date", row=2, col=1)
fig.update_yaxes(title_text="Price", row=1, col=1)
fig.update_yaxes(title_text="Sentiment Score", row=2, col=1)

# Display the interactive subplot
fig.show()
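In the plot above, the sentiment trace has one point per headline, so a day with several headlines contributes several points at the same x position. The later cells average scores per day with `groupby('date')` before comparing them with prices. The sketch below shows the same daily averaging in plain Python; the function name `daily_mean_scores` and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def daily_mean_scores(rows):
    """Average compound scores per calendar day.

    `rows` is an iterable of (date_string, compound_score) pairs.
    Returns a list of (date_string, mean_score) tuples sorted by date.
    """
    by_day = defaultdict(list)
    for day, score in rows:
        by_day[day].append(score)
    # ISO-formatted date strings sort chronologically.
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


# Made-up rows, for illustration only.
sample = [("2024-01-03", 0.25), ("2024-01-02", 0.5), ("2024-01-02", -0.25)]
daily = daily_mean_scores(sample)
print(daily)
```

Plotting one averaged value per day gives the sentiment line the same granularity as the daily closing price.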
Code
# Stock symbols to analyze (add more symbols as needed)
company_stock_symbols = ["LNVGY", "DELL", "HPE", "MSFT", "SSNLF"]
api_key = API_KEY
all_headlines = []

# Set the date range for news retrieval
end_date = datetime.now()
start_date = end_date - timedelta(days=20)
from_date = start_date.strftime("%Y-%m-%d")
to_date = end_date.strftime("%Y-%m-%d")

# Retrieve news data for each stock symbol
for stock_symbol in company_stock_symbols:
    news_url = f"https://newsapi.org/v2/everything?q={stock_symbol}&apiKey={api_key}&from={from_date}&to={to_date}&language=en"
    response = requests.get(news_url)
    if response.status_code == 200:
        news_data = response.json()
        articles = news_data['articles']
        headlines = [(article['title'], article['publishedAt']) for article in articles]
        all_headlines.extend(headlines)
    else:
        print(f"Failed to retrieve news data for {stock_symbol}.")

# Keep only headlines that contain one of the stock symbols
company_related_headlines = []
for headline, published_at in all_headlines:
    try:
        date = parser.parse(published_at)
        for stock_symbol in company_stock_symbols:
            if stock_symbol in headline:
                company_related_headlines.append((headline, date, stock_symbol))
    except ValueError:
        pass

# Sort headlines by date
company_related_headlines.sort(key=lambda x: x[1])

# Display company-related headlines
# for data in company_related_headlines:
#     print(f'Headline: {data[0]}\nDate: {data[1]}\nSymbol: {data[2]}\n')

# Sentiment analysis using VADER (vaderSentiment package)
analyzer = SentimentIntensityAnalyzer()
sentiments = []
neg_scores = []
neu_scores = []
pos_scores = []
compound_scores = []

# Analyze sentiment for each headline
for sentence in company_related_headlines:
    vs = analyzer.polarity_scores(sentence[0])
    sentiments.append(sentence[0])
    neg_scores.append(vs['neg'])
    neu_scores.append(vs['neu'])
    pos_scores.append(vs['pos'])
    compound_scores.append(vs['compound'])

# Create a DataFrame for sentiment scores
company_sentiment_df = pd.DataFrame({
    'Headlines': sentiments,
    'Negative Score': neg_scores,
    'Neutral Score': neu_scores,
    'Positive Score': pos_scores,
    'Compound Score': compound_scores,
})

# Merge sentiment data with company-related headlines
company_merged_df = pd.merge(
    pd.DataFrame(company_related_headlines, columns=['Headlines', 'date', 'Symbol']),
    company_sentiment_df, how="inner", on=["Headlines"])

# Format date columns
company_merged_df['date'] = company_merged_df['date'].dt.strftime('%Y-%m-%d')
company_merged_df['date'] = pd.to_datetime(company_merged_df['date'])

# Group sentiment scores by date
grouped_df = company_merged_df.groupby('date')['Compound Score'].mean().reset_index()

# Display grouped sentiment scores
# print(grouped_df)

# Retrieve Apple stock data using Yahoo Finance API
apple = yf.Ticker("AAPL")
apple_data = apple.history(period='1mo')

# Format date columns for merging
apple_data['date'] = apple_data.index.to_series().dt.strftime('%Y-%m-%d')
grouped_df['date'] = pd.to_datetime(grouped_df['date'])
apple_data['date'] = pd.to_datetime(apple_data['date'])

# Merge sentiment scores with Apple stock data
full_merged_data = pd.merge(grouped_df, apple_data, how='inner', left_on='date', right_on='date')

# Calculate correlation between 'Compound Score' and 'Close'
correlation = full_merged_data['Compound Score'].corr(full_merged_data['Close'])
# print(f"Correlation between Compound Score and Closing Price: {correlation}")
# print(full_merged_data)

# Plotting
# Create subplots with two y-axes
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.update_layout(
    title='Competitors Sentiment Scores vs. Apple Closing Price',
    xaxis_title='Date',
    plot_bgcolor='white',   # Set plot background color
    paper_bgcolor='white',  # Set paper background color
)

# Identify increases and decreases in Closing Price
positive_changes = full_merged_data['Close'].diff().gt(0)
negative_changes = full_merged_data['Close'].diff().lt(0)

# Add traces for the Compound Score, Closing Price, and changes
fig.add_trace(go.Scatter(x=full_merged_data['date'], y=full_merged_data['Compound Score'], mode='lines+markers', name='Compound Score'), secondary_y=False)
fig.add_trace(go.Scatter(x=full_merged_data['date'], y=full_merged_data['Close'], mode='lines+markers', name='Closing Price', line=dict(color='darkgoldenrod')), secondary_y=True)
fig.add_trace(go.Scatter(x=full_merged_data['date'][positive_changes], y=full_merged_data['Close'][positive_changes], mode='markers', name='Positive Change', marker=dict(color='green', size=8)), secondary_y=True)
fig.add_trace(go.Scatter(x=full_merged_data['date'][negative_changes], y=full_merged_data['Close'][negative_changes], mode='markers', name='Negative Change', marker=dict(color='red', size=8)), secondary_y=True)

# Update y-axis labels and styles
fig.update_yaxes(title_text='Compound Score', secondary_y=False, color='blue', showline=True, linecolor='blue', linewidth=2)
fig.update_yaxes(title_text='Closing Price', secondary_y=True, color='darkgoldenrod', showline=True, linecolor='darkgoldenrod', linewidth=2)
fig.update_xaxes(showgrid=True, zeroline=True, gridcolor='lightgrey', gridwidth=0.5, showline=True, linecolor='#2a3f5f', linewidth=2)  # Show minimal gridlines

# Display the plot
fig.show()
Code
# Global factors vs Apple stock
# Keywords describing global factors (add more keywords as needed)
globalfactors_symbols = ["pandemic", "tornado", "hurricane", "Pandemic", "economic crisis", "earthquake", "Global recession", "currency"]
api_key = API_KEY
all_headlines = []

# Set the date range for news retrieval
end_date = datetime.now()
start_date = end_date - timedelta(days=20)
from_date = start_date.strftime("%Y-%m-%d")
to_date = end_date.strftime("%Y-%m-%d")

# Retrieve news data for each keyword
for keyword in globalfactors_symbols:
    news_url = f"https://newsapi.org/v2/everything?q={keyword}&apiKey={api_key}&from={from_date}&to={to_date}&language=en"
    response = requests.get(news_url)
    if response.status_code == 200:
        news_data = response.json()
        articles = news_data['articles']
        headlines = [(article['title'], article['publishedAt']) for article in articles]
        all_headlines.extend(headlines)
    else:
        print(f"Failed to retrieve news data for {keyword}.")

# Keep only headlines that contain one of the global-factor keywords
globalfactors_related_headlines = []
for headline, published_at in all_headlines:
    try:
        date = parser.parse(published_at)
        for keyword in globalfactors_symbols:
            if keyword in headline:
                globalfactors_related_headlines.append((headline, date, keyword))
    except ValueError:
        pass

# Sort headlines by date
globalfactors_related_headlines.sort(key=lambda x: x[1])

# Display global-factor headlines
# for data in globalfactors_related_headlines:
#     print(f'Headline: {data[0]}\nDate: {data[1]}\nSymbol: {data[2]}\n')

# Sentiment analysis using VADER (vaderSentiment package)
analyzer = SentimentIntensityAnalyzer()
sentiments = []
neg_scores = []
neu_scores = []
pos_scores = []
compound_scores = []

# Analyze sentiment for each headline
for sentence in globalfactors_related_headlines:
    vs = analyzer.polarity_scores(sentence[0])
    sentiments.append(sentence[0])
    neg_scores.append(vs['neg'])
    neu_scores.append(vs['neu'])
    pos_scores.append(vs['pos'])
    compound_scores.append(vs['compound'])

# Create a DataFrame for sentiment scores
factors_sentiment_df = pd.DataFrame({
    'Headlines': sentiments,
    'Negative Score': neg_scores,
    'Neutral Score': neu_scores,
    'Positive Score': pos_scores,
    'Compound Score': compound_scores,
})

# Merge sentiment data with global-factor headlines
factors_merged_df = pd.merge(
    pd.DataFrame(globalfactors_related_headlines, columns=['Headlines', 'date', 'Symbol']),
    factors_sentiment_df, how="inner", on=["Headlines"])

# Format date columns
factors_merged_df['date'] = factors_merged_df['date'].dt.strftime('%Y-%m-%d')
factors_merged_df['date'] = pd.to_datetime(factors_merged_df['date'])

# Group sentiment scores by date
grouped_df = factors_merged_df.groupby('date')['Compound Score'].mean().reset_index()

# Display grouped sentiment scores
# print(grouped_df)

# Retrieve Apple stock data using Yahoo Finance API
apple = yf.Ticker("AAPL")
apple_data = apple.history(period='1mo')

# Format date columns for merging
apple_data['date'] = apple_data.index.to_series().dt.strftime('%Y-%m-%d')
grouped_df['date'] = pd.to_datetime(grouped_df['date'])
apple_data['date'] = pd.to_datetime(apple_data['date'])

# Merge sentiment scores with Apple stock data
full_merged_data = pd.merge(grouped_df, apple_data, how='inner', left_on='date', right_on='date')

# Calculate correlation between 'Compound Score' and 'Close'
correlation = full_merged_data['Compound Score'].corr(full_merged_data['Close'])
# print(f"Correlation between Compound Score and Closing Price: {correlation}")
# print(full_merged_data)

# Plotting
# Reverse the interpretation of sentiment scores
full_merged_data['Reversed Compound Score'] = -full_merged_data['Compound Score']

# Calculate correlation between 'Reversed Compound Score' and 'Close'
# (the printed value refers to the reversed score)
reversed_correlation = full_merged_data['Reversed Compound Score'].corr(full_merged_data['Close'])
print(f"Correlation between Compound Score and Closing Price: {reversed_correlation}")

# Create subplots with two y-axes
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.update_layout(
    title='Global Factors Sentiment Scores vs. Apple Closing Price',
    xaxis_title='Date',
    plot_bgcolor='white',   # Set plot background color
    paper_bgcolor='white',  # Set paper background color
)

# Identify increases and decreases in Closing Price
positive_changes = full_merged_data['Close'].diff().gt(0)
negative_changes = full_merged_data['Close'].diff().lt(0)

# Add traces for the Reversed Compound Score, Closing Price, and changes
fig.add_trace(go.Scatter(x=full_merged_data['date'], y=full_merged_data['Reversed Compound Score'], mode='lines+markers', name='Compound Score'), secondary_y=False)
fig.add_trace(go.Scatter(x=full_merged_data['date'], y=full_merged_data['Close'], mode='lines+markers', name='Closing Price', line=dict(color='darkgoldenrod')), secondary_y=True)
fig.add_trace(go.Scatter(x=full_merged_data['date'][positive_changes], y=full_merged_data['Close'][positive_changes], mode='markers', name='Positive Change', marker=dict(color='green', size=8)), secondary_y=True)
fig.add_trace(go.Scatter(x=full_merged_data['date'][negative_changes], y=full_merged_data['Close'][negative_changes], mode='markers', name='Negative Change', marker=dict(color='red', size=8)), secondary_y=True)

# Update y-axis labels and styles
fig.update_yaxes(title_text='Compound Score', secondary_y=False, color='blue', showline=True, linecolor='blue', linewidth=2)
fig.update_yaxes(title_text='Closing Price', secondary_y=True, color='darkgoldenrod', showline=True, linecolor='darkgoldenrod', linewidth=2)
fig.update_xaxes(showgrid=True, zeroline=True, gridcolor='lightgrey', gridwidth=0.5, showline=True, linecolor='#2a3f5f', linewidth=2)  # Show minimal gridlines

# Display the plot
fig.show()
Correlation between Compound Score and Closing Price: 0.3345370286580724
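The value printed above is computed on the reversed (negated) compound score. Negating one series flips the sign of the Pearson correlation and leaves its magnitude unchanged, so in that run the raw compound score would correlate with the closing price at about -0.33. The sketch below illustrates the sign flip with a hand-written Pearson coefficient; the function name `pearson` and the sample values are made up for illustration and are not data from the notebook.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only.
scores = [-0.4, -0.1, -0.3, 0.2, -0.2]
closes = [190.0, 188.5, 191.0, 187.0, 189.5]

raw_r = pearson(scores, closes)
reversed_r = pearson([-s for s in scores], closes)
print(f"raw: {raw_r:.3f}, reversed: {reversed_r:.3f}")
```

Reporting both values, or stating clearly which one is printed, avoids reading the sign of the relationship the wrong way round.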
Discussion
In this project, we explored the intersection of finance and sentiment analysis, with a specific focus on Apple stock. Using the SentimentIntensityAnalyzer class from the vaderSentiment library in Python, we gauged the emotional tone of news headlines about Apple, its competitors, and global events. The initial phase was data acquisition: the yfinance library supplied one month of AAPL price history, and NewsAPI supplied English-language headlines from the preceding 20 to 30 days.
Following this, each headline was scored with VADER, which returns negative, neutral, and positive proportions together with a compound score between -1 and 1. The compound scores were then plotted against Apple’s closing prices to show how sentiment and price moved over the same period.
The comparison plots suggest a possible relationship between sentiment shifts and stock price movements. Some peaks and troughs in sentiment scores appeared to line up with notable fluctuations in the stock chart, and for the global-factors analysis the correlation between the reversed compound score and Apple’s closing price was about 0.33, a weak-to-moderate association measured over a window of roughly 20 days. This is consistent with public sentiment having some bearing on price, but it does not establish causation, and a sample this small can change noticeably from one run to the next. Stock prices respond to many interacting factors, so sentiment analysis should be considered one element within the broader landscape of financial analysis.
In conclusion, this project exemplifies the integration of sentiment analysis techniques with financial data to derive meaningful insights into the emotional undercurrents surrounding Apple stock. As we navigate the complexities of stock markets, this approach introduces a nuanced layer to our comprehension, laying the groundwork for continued exploration and refinement of strategies in financial decision-making.