I am building a production-grade recommendation system for a short-video platform (processing around 50k videos). The architecture utilizes a vector database (Qdrant) to store and query 768-dimensional video embeddings generated by a VideoCLIP model.
To track user preferences in real time, I implement an online learning mechanism that updates a single user_vector after each interaction, based on a computed rating that tanh bounds to the range [-1.0, 1.0].
The Goal & The Problem
I want my system to actively update the user vector on both positive and negative signals.
Initially, I tried a standard linear combination:
updated_vector = (alpha * u) + (beta * v * rating)
Where u is the current user vector, v is the video vector, alpha is the decay, and beta is the learning rate.
However, when a user gives a negative rating (e.g., -0.8), multiplying the VideoCLIP embedding by a negative scalar flips its direction entirely. In a 768-d multimodal space, adding this inverted vector creates massive noise across unrelated dimensions, causing aggressive vector drift instead of just moving away from that specific topic.
On the other hand, simply clamping negative ratings to 0 fixes the geometry but freezes the profile: the vector stops evolving during consecutive negative interactions, which creates a severe feedback loop.
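To make the two failure modes concrete, here is a minimal sketch that runs the linear update and the clamped variant on random unit vectors. The alpha and beta values, the random vectors, and the helper names naive_update and clamped_update are illustrative assumptions, not values or code from my system.

```python
import numpy as np


def naive_update(u, v, rating, alpha=0.9, beta=0.1):
    """Linear update followed by L2 normalization (alpha/beta are illustrative)."""
    out = alpha * u + beta * rating * v
    return out / np.linalg.norm(out)


def clamped_update(u, v, rating, alpha=0.9, beta=0.1):
    """Same update, but negative ratings are clamped to 0 first."""
    return naive_update(u, v, max(rating, 0.0), alpha, beta)


# Random stand-ins for a 768-d user vector and video embedding (assumption).
rng = np.random.default_rng(0)
u = rng.standard_normal(768)
u /= np.linalg.norm(u)
v = rng.standard_normal(768)
v /= np.linalg.norm(v)

naive = naive_update(u, v, rating=-0.8)
clamped = clamped_update(u, v, rating=-0.8)

# Clamping: alpha * u renormalizes back to u, so the profile does not move.
cos_u_clamped = float(clamped @ u)
# Naive: similarity to the disliked video drops and can go below zero.
cos_v_before = float(u @ v)
cos_v_naive = float(naive @ v)
```

With clamping, cos_u_clamped stays at 1 up to floating-point error, which is the frozen-profile behavior described above. With the naive update, nothing stops cos_v_naive from becoming negative after repeated dislikes, meaning the profile starts pointing toward -v.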
What I Want to Achieve
I need a mathematically sound way to update the vector during negative interactions so that the user profile actively flees from the disliked content's direction, without destroying the high-dimensional vector space geometry.
Here is the current skeleton I want to optimize, where the negative branch needs a proper geometric subtraction mechanism:
import numpy as np
from typing import List


def update_user_embedding(
    current_vector: List[float],
    video_vector: List[float],
    rating: float,
    learning_rate_b: float,
    interaction_count: int = 0,
) -> List[float]:
    u = np.array(current_vector, dtype=np.float32)
    v = np.array(video_vector, dtype=np.float32)

    # Adaptive learning rate: fast at start, stabilizes near learning_rate_b
    adaptive_lr = max(learning_rate_b, 1.0 / (1.0 + interaction_count))

    if rating >= 0.0:
        # Positive feedback: standard linear combination
        effective_lr = adaptive_lr * rating
        adaptive_decay = 1.0 - effective_lr
        updated_vector = (adaptive_decay * u) + (effective_lr * v)
    else:
        # Negative feedback: need to actively update and move away
        # WITHOUT multiplying 'v' by a negative scalar directly.
        # TODO: Implement a safe subtraction/projection mechanism here
        # Placeholder so the function runs: leaves the profile unchanged.
        updated_vector = u.copy()

    # Normalize vector (L2 norm)
    norm = np.linalg.norm(updated_vector)
    if norm > 0:
        updated_vector = updated_vector / norm

    return updated_vector.tolist()
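For reference, one candidate for the negative branch is a clamped orthogonal rejection: remove part of the component of u that lies along v, and never push past orthogonality. The sketch below is an untested idea rather than a vetted solution. The helper name repel_from and its strength parameter are hypothetical; in the skeleton above, strength could be something like adaptive_lr * abs(rating).

```python
import numpy as np


def repel_from(u: np.ndarray, v: np.ndarray, strength: float) -> np.ndarray:
    """Shrink u's positive component along v by `strength` (expected in [0, 1]).

    Hypothetical helper: components of u orthogonal to v are left untouched
    before renormalization, and u is never pushed toward -v.
    """
    v_norm = np.linalg.norm(v)
    if v_norm == 0.0:
        return u.copy()
    v_hat = v / v_norm

    # Only the positive overlap is removed; if u already points away from v,
    # the overlap is clamped to 0 and u is returned unchanged (up to norm).
    overlap = max(float(np.dot(u, v_hat)), 0.0)
    moved = u - strength * overlap * v_hat

    norm = np.linalg.norm(moved)
    # Degenerate case: u parallel to v with strength == 1 leaves nothing.
    return moved / norm if norm > 0 else u.copy()


# Tiny 3-d example (illustrative values, not real embeddings).
u = np.array([0.6, 0.8, 0.0])
v = np.array([1.0, 0.0, 0.0])

half = repel_from(u, v, strength=0.5)   # similarity to v drops but stays >= 0
full = repel_from(u, v, strength=1.0)   # u becomes orthogonal to v
away = repel_from(np.array([-0.6, 0.8, 0.0]), v, strength=0.5)  # no change
```

The trade-off is that the update stops once u is orthogonal or opposed to v, so a residual frozen case remains for content the profile already avoids. Whether that is acceptable depends on how the resulting vector is used for retrieval.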