Data Scientist, NLP — Text Homogeneity & Anomaly Detection

Upwork

Remote

2 hours ago

No application

About

We are looking for a skilled data scientist with NLP experience to analyze a confidential CSV containing ~2,000 short text entries. The goal is to determine whether the data contains unusual patterns, such as excessive repetition, short or generic wording, or timing anomalies. In other words, we want to find unnatural records within what should be a natural set of records.

Responsibilities

- Clean and preprocess the dataset, handling Unicode and language noise.
- Run exploratory analysis, including length, lexical diversity, and sentiment proxies.
- Create embeddings and run clustering to surface homogeneous groups.
- Apply anomaly detection and temporal analysis to spot suspicious bursts.
- Produce visualizations that clearly explain findings to a non-technical audience.
- Deliver a reproducible Jupyter notebook or Python scripts, plus a short written summary.

Required skills

- Strong Python, pandas, and scikit-learn experience.
- Practical NLP experience with spaCy, HuggingFace, or similar.
- Familiarity with embeddings, clustering (DBSCAN, k-means), and anomaly detection (Isolation Forest, LOF).
- Experience creating clear charts and concise writeups.
- Good communicator, able to explain methods and limitations.

Nice to have

- Prior work on detecting repetitive or coordinated text.
- Stylometry or forensic linguistics exposure.
- Experience comparing datasets to public fake-review benchmarks.

Deliverables (suggested)

- Jupyter notebook or scripts with commented code.
- Visuals: rating distribution, text length boxplot, cluster map, timeline of suspicious activity.
- Short report summarizing methodology, key signals, and a reasoned likelihood estimate of manipulation.
- Brief recommendations for next steps.

Timeline

- Estimated scope: 15–25 hours.
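As a rough illustration of the repetition and timing checks described above, the following Python sketch uses only the standard library. It is a simplified stand-in, not the embedding, clustering, or Isolation Forest approach named in the required skills: token-set Jaccard similarity flags near-identical entries, and hourly bucketing flags bursts. The sample texts, timestamps, function names, and thresholds (0.8 and 3) are all hypothetical and would need tuning against the real data.

```python
from collections import Counter
from datetime import datetime
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for real preprocessing."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicate_pairs(texts, threshold=0.8):
    """Return index pairs whose token sets overlap at or above the threshold."""
    sets = [tokens(t) for t in texts]
    pairs = []
    # All-pairs comparison: quadratic, but acceptable for ~2,000 entries.
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs


def burst_hours(timestamps, min_count=3):
    """Return the hourly buckets that contain at least min_count records."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(hour for hour, n in buckets.items() if n >= min_count)


# Hypothetical sample data, not taken from the confidential CSV.
texts = [
    "Great product, works well",
    "great product works well!",
    "Arrived late and the box was damaged",
    "Great product, works well",
]
timestamps = [
    datetime(2024, 1, 1, 10, 5),
    datetime(2024, 1, 1, 10, 20),
    datetime(2024, 1, 1, 10, 45),
    datetime(2024, 1, 2, 15, 0),
]

duplicate_pairs = near_duplicate_pairs(texts)
suspicious_hours = burst_hours(timestamps)
```

Here entries 0, 1, and 3 share the same token set and would be paired with each other, while the three records posted within the same hour on the first day form the only burst. A real analysis would likely replace the token sets with sentence embeddings and the fixed thresholds with a fitted anomaly model.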
