About Me

CS PhD Candidate @ Sapienza

AI Scientist @ Outsampler

I am a PhD candidate in Computer Science at Sapienza University of Rome specializing in artificial intelligence for financial systems. My research focuses on causal modeling and generative approaches for financial time series, combining rigorous mathematical formulations with modern deep learning techniques.

Alongside my academic work, I serve as an AI Scientist at Outsampler, where I design and deploy language-model-driven solutions for fraud detection and automated financial report generation. My work includes building conversational AI for time-series analytics and translating natural language queries into structured financial data insights.

My broader experience spans causal discovery, generative modeling, time-series shock detection, distributed AI optimization, and NLP systems, with applications ranging from financial markets to networked systems.

Resume

Education

PhD in Computer Science
Sapienza, University of Rome
Oct 2022 - May 2026

Thesis: "Analysis and Synthetic Generation of Financial Time-Series"

MS in Computer Science
Sapienza, University of Rome
Oct 2020 - Oct 2022
Grade: 110/110 with honours

Thesis: "Adversarial Learning to Rank - Transferable Text-Based Attacks to Black-Box Neural Ranking Models: WARA and WSRA"

BS in Computer Science
Tor Vergata, University of Rome
Oct 2017 - Oct 2020
Grade: 110/110 with honours

Thesis: "Diffusion in the Presence of Ambivalent Relationships: The Role of the Negative Relationships in the Complexity of the Problem"

Experience

AI Scientist
Outsampler
July 2025 - Present

Building conversational agents for time-series analytics, applied to fraud detection and financial report generation.

Skills

Programming Languages
Python, C, C++, Java
Backend & Frontend
SQL, MongoDB, REST APIs (FastAPI), React
Deployment & Cloud
Docker, Slurm, Google Cloud Platform
ML & Data
(Deep) Machine Learning, PyTorch, NLP & LLM serving (vLLM, SGLang), Time Series, Causality

Publications

STOP! A Solution for Sustainable and Geo-Distributed AI Inference
V Arrigoni, G Masi, N Bartolini
Under review. [Preprint]
Robust Causal Discovery in Real-World Time Series with Power-Laws
M Tusoni, G Masi, A Coletta, A Glielmo, V Arrigoni, N Bartolini
Under review. [Preprint]
DiffCATS: Causally Associated Time-Series Generation through Diffusion Models
G Masi, A Coletta, E Fons, S Vyetrenko, N Bartolini
Transactions on Machine Learning Research (2026) [Link]
Patrolling Heterogeneous Targets with FANETs
N Bartolini, G Masi, M Prata, F Trombetti
IEEE INFOCOM 2024-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) [Link]
LOB-based Deep Learning Models for Stock Price Trend Prediction: A Benchmark Study
M Prata, G Masi, L Berti, V Arrigoni, A Coletta, I Cannistraci, S Vyetrenko, P Velardi, N Bartolini
Artificial Intelligence Review [Link]
On Correlated Stock Market Time Series Generation
G Masi, M Prata, M Conti, N Bartolini, S Vyetrenko
Proceedings of the Fourth ACM International Conference on AI in Finance, 524-532 [Link]
Stock Shocks Modelling and Forecasting
V Arrigoni, G Masi, E Mercanti, N Bartolini, S Vyetrenko
2023 IEEE 43rd International Conference on Distributed Computing Systems Workshops (ICDCSW) [Link]

Projects

Causal Discovery

PLaCy

Robust causal discovery method for stochastic time series leveraging power-law spectral features. Exploits the inherent power-law distribution in real-world time series frequency spectra to amplify genuine causal signals and reduce noise sensitivity, outperforming state-of-the-art alternatives on synthetic and real-world datasets.

Causal Discovery

DiffCATS

DiffCATS is a diffusion model that generates multiple causally associated time series together with a ground-truth causal graph reflecting their mutual temporal dependencies. It requires only observational time-series data for training.

Time-series Forecasting

LOBCAST

Open-source Python framework for Stock Price Trend Prediction (SPTP) standardization, implementing data preprocessing, deep learning model training, evaluation, and profit analysis. Benchmarks fifteen state-of-the-art deep learning models on Limit Order Book data, examining robustness and generalizability.

Time-series Generation

CoMeTS-GAN

Correlated Multivariate Time Series generative framework based on Conditional Generative Adversarial Networks (C-GANs) designed to generate price and volume time series of correlated stocks. Accurately learns and reproduces stylised facts and inter-asset correlations, crucial for achieving realism in multi-stock simulation environments.

Shock detection

Stock Shocks Modelling and Forecasting

Formal definition of stock shocks based on fat-tailed Lévy-stable distributions. Forecasting algorithms built on Limit Order Book data with machine learning approaches (random forest, hierarchical clustering) achieve high precision and recall.

Patrolling

Patrolling

Patrolling is a drone-based surveillance system that uses computer vision and deep learning to monitor and secure large areas. The system can detect intruders, recognize faces, and provide real-time alerts to security personnel.

NLP

Natural Language Processing

Word-in-Context

Aspect-based Sentiment Analysis

Word Sense Disambiguation

Blockchain

eTreeum

eTreeum was created to help raise awareness of the environmental impact of blockchain technologies. Ideally, users of this app will help plant trees in the real world by playing with crypto-trees. Users can start with a free seed, take care of it, and then sell it for cryptocurrency.