I can give you a complete analytical procedure and example code, but I cannot directly access or extract posts from your Bluesky feed without the data itself. Below is a precise workflow you can apply once you have the post-level metrics (e.g., timestamps and like counts).
1. Data Requirements
Prepare a dataset containing at least:
- post_id
- created_at (timestamp)
- like_count
Filter the dataset to only include posts from the last 7 days.
2. Compute the Baseline Trend
If you want outliers relative to a trend (rather than a flat mean), you need to model the expected likes per post. Typical approaches:
A. Linear trend:
Fit:
like_count = β0 + β1 * time_index
B. Rolling mean trend: Compute a rolling average (e.g., 24-hour or N-post window).
C. LOESS smoothing: Provides a smooth non-parametric trend.
For business analytics, the linear model and LOESS are common choices: the linear fit is easy to explain, while LOESS adapts to a trend that bends over the week.
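As a rough illustration of option B, here is a minimal sketch of a time-based rolling-mean baseline in pandas. The helper name `rolling_trend`, the sample `posts` data, and the choice to exclude each post from its own baseline are assumptions made for this example, not part of the workflow above.

```python
import pandas as pd

def rolling_trend(df: pd.DataFrame, window: str = "1D") -> pd.Series:
    """Expected likes per post = mean likes of earlier posts inside the window."""
    s = df.sort_values("created_at").set_index("created_at")["like_count"]
    # closed="left" leaves the post itself out, so a spike does not raise its own baseline
    expected = s.rolling(window, closed="left").mean()
    # Posts with no earlier post in the window come back as NaN; fall back to the overall mean
    return expected.fillna(s.mean())

# Hypothetical sample data, only to show the shape of the result
posts = pd.DataFrame({
    "post_id": ["a", "b", "c", "d"],
    "created_at": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 06:00", "2024-01-01 12:00", "2024-01-03 00:00",
    ]),
    "like_count": [10, 20, 30, 100],
})
expected = rolling_trend(posts)
print(expected)
```

The returned series is indexed by created_at and can be subtracted from like_count to get residuals. The fallback to the overall mean is one possible convention; dropping those posts instead would also be reasonable.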
3. Compute Residuals
Residual = actual_like_count – expected_like_count_from_trend.
Then compute:
- Mean of residuals (μ)
- Standard deviation of residuals (σ)
4. Z-Score
z = (residual – μ) / σ
Flag outliers where:
z >= +2
This captures posts whose likes are significantly above expectation relative to both trend and overall variance.
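To make the arithmetic concrete, here is a small worked example using only the Python standard library. The residual values are invented for illustration, and the sample standard deviation (n - 1 in the denominator) is an assumed choice.

```python
from statistics import mean, stdev

# Hypothetical residuals (actual minus expected likes) for ten posts
residuals = [-3.0, -1.0, 0.0, -2.0, 1.0, -1.0, 0.0, 14.0, -4.0, -4.0]

mu = mean(residuals)      # residuals sum to 0, so mu is 0.0
sigma = stdev(residuals)  # sample standard deviation, sqrt(244 / 9), about 5.21

z_scores = [(r - mu) / sigma for r in residuals]

# Keep only posts at least two standard deviations above expectation
flagged = [i for i, z in enumerate(z_scores) if z >= 2]
print(flagged)  # -> [7]
```

Only the post with residual 14 is flagged (z is about 2.69). Posts far below the trend are ignored, because the rule is one-sided.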
5. Example Python (pseudocode with pandas + statsmodels)
import pandas as pd
import statsmodels.api as sm
# df is assumed to be loaded already, with columns: post_id, created_at, like_count
df['created_at'] = pd.to_datetime(df['created_at'])
# Filter to the last 7 days, measured back from the newest post in the data
cutoff = df['created_at'].max() - pd.Timedelta(days=7)
df_week = df[df['created_at'] >= cutoff].copy()
# Trend model: linear regression on a time index (hours since the first post in the window)
df_week = df_week.sort_values('created_at')
df_week['t'] = (df_week['created_at'] - df_week['created_at'].min()).dt.total_seconds() / 3600
X = sm.add_constant(df_week['t'])
model = sm.OLS(df_week['like_count'], X).fit()
df_week['expected'] = model.predict(X)
# Residuals and z-score
df_week['residual'] = df_week['like_count'] - df_week['expected']
mu = df_week['residual'].mean()
sigma = df_week['residual'].std()
df_week['z'] = (df_week['residual'] - mu) / sigma
# Outliers: z ≥ +2
outliers = df_week[df_week['z'] >= 2]
6. Output
The outliers DataFrame contains the set of posts whose like performance is significantly above trend using the +2σ criterion.
Here are several open-source GitHub projects that implement time-series or outlier / anomaly detection — you can adapt them to detect “posts with likes >> expected trend” on a feed. I grouped them by suitability for your use (simple time-series, streaming, advanced / ML).
✅ Good GitHub projects for outlier detection in time series / counts (e.g. likes)
In addition, for a broader survey or catalogue rather than a single tool, awesome-TS-anomaly-detection provides a curated list of libraries, datasets, and resources. It is handy if you want to compare several methods and find the one that works best.
🔎 Which to pick for “post-likes outlier” detection and why