title: Using Natural Language Processing for Economic Policy Analysis
---

## Using Natural Language Processing for Economic Policy Analysis

Natural Language Processing (NLP) is redefining how economists, policymakers, and data scientists interpret and analyze unstructured text data. In an era when vast quantities of political speeches, legislative texts, central bank statements, and government reports are published daily, NLP offers a scalable, automated way to extract insights that once required intensive manual review.

NLP is a powerful ally in the realm of economic policy analysis. By transforming qualitative political and governmental text into structured, analyzable data, it enhances our ability to detect policy trends, forecast outcomes, and hold decision-makers accountable.
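
To make the "forecast outcomes" point concrete: once speeches are encoded as TF-IDF vectors, any standard classifier can map them to an outcome of interest. The sketch below is a minimal, hypothetical illustration; the speeches and the stance labels (1 = expansionary, 0 = contractionary) are invented for demonstration, and a real application would need a labeled corpus, a held-out test set, and cross-validation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: hypothetical speeches with hypothetical stance labels
speeches = [
    "We will expand public investment and increase social spending.",
    "Stimulus programs and wage support will drive the recovery.",
    "Lower taxes and easier credit will encourage household demand.",
    "Spending must be cut to bring the deficit under control.",
    "We will raise interest rates and tighten the money supply.",
    "Austerity and fiscal discipline are required to restore confidence.",
]
stance = [1, 1, 1, 0, 0, 0]  # 1 = expansionary, 0 = contractionary (invented)

# TF-IDF features feeding a linear classifier
model = make_pipeline(TfidfVectorizer(stop_words="english"), LogisticRegression())
model.fit(speeches, stance)

new_speech = ["We propose tighter budgets and higher interest rates."]
print(model.predict(new_speech))        # predicted stance label
print(model.predict_proba(new_speech))  # class probabilities
```

Wrapping the vectorizer and classifier in a single pipeline keeps the text transformation and the model fit together, so new speeches can be scored directly from raw text.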

As models continue to evolve and become more interpretable, we can expect even deeper integration of NLP into the economic policymaking and analysis process, bridging the gap between language and action in the world of public economics.

## Appendix: NLP Example for Economic Policy Analysis Using Political Speeches

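The following self-contained script works on a simulated corpus of ten one-sentence "speeches". It walks through TF-IDF vectorization, LDA topic modeling, a plot of per-speech topic proportions, and an optional sentiment pass with TextBlob.
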
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Example corpus: simulated economic policy speeches
documents = [
    "We must focus on reducing inflation and stabilizing interest rates.",
    "Investing in healthcare and education is vital to long-term growth.",
    "Tax cuts will boost consumer spending and revive the economy.",
    "Our plan includes raising the minimum wage and improving labor rights.",
    "We propose deregulating markets to increase economic efficiency.",
    "Stronger regulations on banks will prevent financial crises.",
    "We aim to decrease the fiscal deficit while maintaining social programs.",
    "Public infrastructure investment will stimulate employment.",
    "Monetary tightening is necessary to prevent overheating of the economy.",
    "Support for small businesses and innovation is key to competitiveness."
]

# Step 1: TF-IDF Vectorization
vectorizer = TfidfVectorizer(stop_words='english', max_features=100)
X_tfidf = vectorizer.fit_transform(documents)

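# Note: LDA is conventionally fit on raw term counts (CountVectorizer);
# scikit-learn accepts TF-IDF input as well, so the matrix is reused here for brevity.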
# Step 2: Topic Modeling with LDA
lda = LatentDirichletAllocation(n_components=3, random_state=42)
lda_topics = lda.fit_transform(X_tfidf)

# Display the top keywords for each topic
def display_topics(model, feature_names, n_top_words):
    for topic_idx, topic in enumerate(model.components_):
        print(f"\nTopic {topic_idx + 1}:")
        # argsort is ascending, so the reversed slice picks the highest-weight terms
        print(" | ".join([feature_names[i] for i in topic.argsort()[:-n_top_words - 1:-1]]))

feature_names = vectorizer.get_feature_names_out()
display_topics(lda, feature_names, 5)

# Step 3: Visualizing document-topic distributions
topic_df = pd.DataFrame(lda_topics, columns=[f"Topic {i+1}" for i in range(lda.n_components)])
topic_df['Document'] = [f"Speech {i+1}" for i in range(len(documents))]

# pandas creates its own figure here; passing figsize to .plot() avoids the
# empty extra window that a preceding plt.figure() call would leave behind
ax = topic_df.set_index('Document').plot(kind='bar', stacked=True, colormap='tab20c', figsize=(10, 6))
ax.set_title("Topic Distribution Across Speeches")
ax.set_ylabel("Proportion")
plt.tight_layout()
plt.show()

# Optional: sentiment analysis example with TextBlob (if available)
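# TextBlob polarity scores range from -1 (most negative) to +1 (most positive)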
try:
    from textblob import TextBlob
    sentiments = [TextBlob(doc).sentiment.polarity for doc in documents]
    sentiment_df = pd.DataFrame({'Speech': [f"Speech {i+1}" for i in range(len(documents))],
                                 'Sentiment': sentiments})

    plt.figure(figsize=(8, 5))
    sns.barplot(data=sentiment_df, x='Speech', y='Sentiment', palette='coolwarm')
    plt.title("Sentiment Scores of Political Speeches")
    plt.axhline(0, color='gray', linestyle='--')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()
except ImportError:
    print("Optional: install TextBlob for sentiment analysis (pip install textblob)")
```
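
Run as-is, the script prints the five top-weighted terms for each of the three LDA topics, draws a stacked bar chart of per-speech topic proportions, and, if TextBlob is installed, adds a sentiment bar chart. With only ten one-sentence documents the topics are illustrative rather than meaningful, so treat the output as a demonstration of the workflow, not an analysis.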