Stock prediction using topic modeling and sentiment analysis techniques: A machine learning case study on Ecopetrol

Authors

  • Edwar David Valenzuela Cortés, Universidad Nacional de Colombia
  • Santiago Puente Núñez, Universidad Nacional de Colombia

Keywords:

stock market prediction, topic modeling, sentiment analysis, machine learning, investment strategies

Abstract

This study introduces a novel technique for predicting market movements using topic and sentiment analysis of financial news about Ecopetrol. News headlines from Hydrocarbons and La República (July 2012 to December 2023) were analyzed using BERTopic, FinBERT, and VADER. The findings show that predictive models based on news headlines are more effective over 3- and 4-week horizons than over shorter ones. The Gradient Boosting model for week 3 achieved a profitability of 49.4% and an accuracy of 57%, while a Random Forest model for week 4 yielded a profitability of 33.11% with a 9.71% error, outperforming the buy-and-hold strategy. These results highlight the advantage of short-term trend predictions in financial decision-making.
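
As a rough illustration of how a profitability figure can be compared against buy and hold, the sketch below compounds per-period returns for a simple signal-following rule. It is not the authors' backtest: the trading rule (hold the stock when the model predicts a rise, stay in cash otherwise), the function names, and the sample numbers are all assumptions made for this example.

```python
from typing import List, Sequence


def cumulative_return(period_returns: Sequence[float]) -> float:
    """Compound simple per-period returns into one cumulative return."""
    total = 1.0
    for r in period_returns:
        total *= 1.0 + r
    return total - 1.0


def strategy_returns(predicted_up: Sequence[bool],
                     realized_returns: Sequence[float]) -> List[float]:
    """Assumed rule: earn the realized return only in periods where the
    model predicts a rise; otherwise stay in cash (0% return)."""
    if len(predicted_up) != len(realized_returns):
        raise ValueError("signals and returns must have the same length")
    return [r if up else 0.0 for up, r in zip(predicted_up, realized_returns)]


def directional_accuracy(predicted_up: Sequence[bool],
                         realized_returns: Sequence[float]) -> float:
    """Share of periods where the predicted direction matched the sign
    of the realized return."""
    hits = sum(1 for up, r in zip(predicted_up, realized_returns)
               if up == (r > 0))
    return hits / len(realized_returns)


# Hypothetical per-period returns and model signals (not data from the study).
realized = [0.10, -0.05, 0.02, -0.10]
signals = [True, False, True, True]

buy_and_hold = cumulative_return(realized)
strategy = cumulative_return(strategy_returns(signals, realized))
accuracy = directional_accuracy(signals, realized)
```

With these made-up inputs the signal-following rule ends slightly positive while buy and hold ends negative, which is the kind of comparison the abstract reports; transaction costs and the actual signal construction are left out.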

Author biographies

Edwar David Valenzuela Cortés, Universidad Nacional de Colombia

Business Administration student at the Universidad Nacional de Colombia, Bogotá campus.

Santiago Puente Núñez, Universidad Nacional de Colombia

Economics student at the Universidad Nacional de Colombia, Bogotá campus.

References

Araci, D. (2019). FinBERT: Financial sentiment analysis with pre-trained language models. https://arxiv.org/abs/1908.10063

Balaneji, F., & Maringer, D. (2022). Applying sentiment analysis, topic modeling, and XGBoost to classify implied volatility. 2022 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr), 1–8. https://doi.org/10.1109/CIFEr52523.2022.9776196

Chen, W., Rabhi, F., Liao, W., & Al-Qudah, I. (2023). Leveraging state-of-the-art topic modeling for news impact analysis on financial markets: A comparative study. Electronics, 12(12), 2605. https://doi.org/10.3390/electronics12122605

Correia, F., Madureira, A., & Bernardino, J. (2022). Deep neural networks applied to stock market sentiment analysis. Sensors, 22(12), 4409. https://doi.org/10.3390/s22124409

García-Méndez, S., de Arriba-Pérez, F., Barros-Vila, A., González-Castaño, F. J., & Costa-Montenegro, E. (2023). Automatic detection of relevant information, predictions and forecasts in financial news through topic modelling with latent Dirichlet allocation. Applied Intelligence, 53(16), 19610–19628. https://doi.org/10.1007/s10489-023-04452-4

Grootendorst, M. (2022). BERTopic: Neural topic modeling with a class-based TF-IDF procedure. https://arxiv.org/abs/2203.05794

Iguarán Cotes, J. (2019). Aplicación de redes neuronales para predecir el precio de acciones en la bolsa colombiana. http://hdl.handle.net/1992/44483

Jing, N., Wu, Z., & Wang, H. (2021). A hybrid model integrating deep learning with investor sentiment analysis for stock price prediction. Expert Systems with Applications, 178, Article 115019. https://doi.org/10.1016/j.eswa.2021.115019

Khan, W., Ghazanfar, M. A., Azam, M. A., Karami, A., Alyoubi, K. H., & Alfakeeh, A. S. (2022). Stock market prediction using machine learning classifiers and social media, news. Journal of Ambient Intelligence and Humanized Computing, 13(7), 3433–3456. https://doi.org/10.1007/s12652-020-01839-w

López-Gaviria, J. I. (2019). Predictibilidad del mercado accionario colombiano. Lecturas De Economía, (91), 117–150. https://doi.org/10.17533/udea.le.n91a04

Maqbool, J., Aggarwal, P., Kaur, R., Mittal, A., & Ganaie, I. A. (2023). Stock prediction by integrating sentiment scores of financial news and MLP-regressor: A machine learning approach. Procedia Computer Science, 218, 1067–1078. https://doi.org/10.1016/j.procs.2023.01.086

Monroy-Perdomo, L., Cardozo-Munar, C. E., Torres-Hernández, A. M., Tena-Galeano, J. L., & López-Rodríguez, C. E. (2022). Formalization of a new stock trend prediction methodology based on the sector price book value for the Colombian market. Heliyon, 8(4), Article e09210. https://doi.org/10.1016/j.heliyon.2022.e09210

Palacio Roldan, J. (2022). Modelo de recomendación para inversión en acciones colombianas pertenecientes al índice colcap basado en análisis técnico y sentimiento del mercado local.

Shah, D., Isah, H., & Zulkernine, F. (2019). Stock market analysis: A review and taxonomy of prediction techniques. International Journal of Financial Studies, 7(2), 1–22. https://doi.org/10.3390/ijfs7020026

Wang, L., Huang, C., Gao, C., Ma, W., & Vosoughi, S. (2023). Joint latent topic discovery and expectation modeling for financial markets. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13937, 45–57. https://doi.org/10.1007/978-3-031-33380-4_4

Published

2025-07-07

How to cite

Valenzuela Cortés, E. D., & Puente Núñez, S. (2025). Stock prediction using topic modeling and sentiment analysis techniques: A machine learning case study on Ecopetrol. Revista Intercambio, (8), 89–124. Retrieved from http://168.176.97.103/ojs/index.php/intercambio/article/view/666

Section

Articles