The SELJI Method: How We Score Products

At SELJI.com, every recommendation is grounded in measurable data — not marketing claims. Where possible, we conduct hands-on testing and direct product reviews to validate real-world performance. Our proprietary review system then combines these results with AI, Natural Language Processing (NLP), and the latest Amazon Product Advertising API 5.0 to transform raw consumer feedback and product metrics into transparent, evidence-based product scores that shoppers can trust.


🧩 1. Data Sources — Where Our Insights Begin

We aggregate and validate data from multiple independent, verifiable channels:

  • Amazon Product Advertising API 5.0 — Real-time access to official Amazon product data, including pricing history, technical specifications, verified-review metadata, and stock or variant identifiers.
    • Version 5.0 ensures greater accuracy in product categorization, price tracking, and attribute consistency across regions.
  • Verified user reviews from major marketplaces and forums (Amazon, Best Buy, Walmart, Reddit).
  • Expert & lab testing reports from independent evaluators.
  • Manufacturer-supplied specs and firmware-update logs.
  • SELJI in-house testing data where applicable.

By combining structured data (API 5.0) with unstructured human feedback, we achieve both breadth and precision in analysis.


🧹 2. Data Cleaning & Normalization

Before any scoring occurs, all inputs undergo a rigorous cleaning process:

  • Deduplication & Canonical Mapping — Merge duplicates across SKUs, regions, and rebrands.
  • Fraud Detection — Identify synthetic or incentivized reviews through linguistic and temporal anomaly detection.
  • Unit Normalization: Convert measurements reported in different units (Pa, dB, mAh, Wh) into one standardized unit per metric.
  • Version Control Alignment — Tag every dataset with firmware build numbers or release identifiers to ensure fair, time-consistent comparisons.

Only verified and time-stamped data proceeds into the scoring pipeline.
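
As a rough illustration of the unit-normalization step, the sketch below converts suction readings to pascals and battery capacity to watt-hours. The function names, the unit table, and the choice of Pa and Wh as canonical units are assumptions made for this example, not SELJI's actual pipeline. The one fixed fact is physical: mAh measures charge, so converting it to Wh needs the pack's nominal voltage (Wh = mAh × V / 1000).

```python
from typing import Optional

# Hypothetical conversion table: factor from each unit to pascals.
PRESSURE_TO_PA = {"Pa": 1.0, "kPa": 1000.0}


def suction_to_pa(value: float, unit: str) -> float:
    """Convert a suction reading to pascals."""
    if unit not in PRESSURE_TO_PA:
        raise ValueError(f"unsupported pressure unit: {unit}")
    return value * PRESSURE_TO_PA[unit]


def battery_to_wh(value: float, unit: str,
                  nominal_voltage: Optional[float] = None) -> float:
    """Convert battery capacity to watt-hours.

    mAh is a charge unit, so the conversion requires the nominal
    voltage of the battery pack: Wh = mAh * V / 1000.
    """
    if unit == "Wh":
        return value
    if unit == "mAh":
        if nominal_voltage is None:
            raise ValueError("mAh to Wh conversion needs a nominal voltage")
        return value * nominal_voltage / 1000.0
    raise ValueError(f"unsupported capacity unit: {unit}")
```

For example, a 5000 mAh pack at a nominal 14.4 V works out to roughly 72 Wh, which can then be compared directly with products that list capacity in Wh.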


🧠 3. Feature & Sentiment Extraction

Our NLP pipeline isolates meaningful performance signals from millions of words of user feedback:

  • Aspect-based sentiment analysis (e.g., suction strength, battery life, noise level).
  • Negation & contrast detection (“not quiet,” “better than before update”).
  • Weighted trust modeling — verified buyers and highly rated reviewers carry more influence.
  • Topic clustering to uncover recurring reliability or usability patterns.

This creates a multidimensional feature map describing what users genuinely experience.
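
The weighted trust modeling above can be pictured as a weighted average of per-aspect sentiment. The sketch below is only an illustration: the `AspectMention` structure, the polarity scale of -1 to +1, and the two weight values are hypothetical, and the aspect extraction and negation handling are assumed to have happened upstream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AspectMention:
    aspect: str      # e.g. "battery life"
    polarity: float  # -1.0 (negative) to +1.0 (positive), negation already resolved
    verified: bool   # True for a verified buyer


# Hypothetical trust weights; the real values are not published in the text.
VERIFIED_WEIGHT = 1.0
UNVERIFIED_WEIGHT = 0.5


def aggregate_aspects(mentions):
    """Return the trust-weighted mean polarity for each aspect."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for m in mentions:
        w = VERIFIED_WEIGHT if m.verified else UNVERIFIED_WEIGHT
        weighted_sum[m.aspect] += w * m.polarity
        weight_total[m.aspect] += w
    return {a: weighted_sum[a] / weight_total[a] for a in weighted_sum}
```

With these assumed weights, one verified "not quiet" (-1.0) and one unverified "quiet" (+1.0) yield a noise-level sentiment of about -0.33, so the verified buyer's experience dominates.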


⚙️ 4. Scoring Model Architecture

Each product is evaluated through a Category-Specific Scoring Matrix, combining quantitative metrics, sentiment polarity, and confidence intervals.

Example – Smart Vacuum Category

Pillar                      Weight   Description
Cleaning Efficiency         0.28     Performance across mixed surfaces
Automation & Maintenance    0.18     Docking, self-cleaning, refill systems
Navigation Intelligence     0.16     AI object avoidance & mapping precision
Reliability & Noise         0.14     Durability and acoustic performance
App / Firmware Stability    0.10     Connectivity and software reliability
Cost Efficiency             0.14     Price-to-performance and maintenance costs

Composite Score = Σ (PillarScore × Weight) ± ConfidenceInterval

Confidence intervals are computed through bootstrap resampling to reflect data stability and reviewer variance.
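
A minimal sketch of how the composite score and a bootstrap interval could be computed for the smart-vacuum example. The weights come from the table above; everything else is an assumption for illustration: the dictionary keys, the 0 to 100 pillar scale, the percentile bootstrap variant, and the idea that each pillar has a list of individual observations to resample.

```python
import random
import statistics

# Weights from the smart-vacuum example; the key names are hypothetical.
WEIGHTS = {
    "cleaning_efficiency": 0.28,
    "automation_maintenance": 0.18,
    "navigation_intelligence": 0.16,
    "reliability_noise": 0.14,
    "app_firmware_stability": 0.10,
    "cost_efficiency": 0.14,
}


def composite_score(pillar_scores):
    """Weighted sum of pillar scores: sum(PillarScore * Weight)."""
    return sum(pillar_scores[p] * w for p, w in WEIGHTS.items())


def bootstrap_interval(samples, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the composite score.

    samples maps each pillar to a list of individual observations.
    Each round resamples every pillar with replacement, recomputes the
    pillar means, and stores the resulting composite score.
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    composites = []
    for _ in range(n_resamples):
        means = {
            p: statistics.fmean(rng.choices(obs, k=len(obs)))
            for p, obs in samples.items()
        }
        composites.append(composite_score(means))
    composites.sort()
    lo_idx = int(alpha / 2 * n_resamples)
    hi_idx = n_resamples - 1 - lo_idx
    return composites[lo_idx], composites[hi_idx]
```

Pillars with few or widely scattered observations produce a wider interval, which is the sense in which the interval reflects data stability and reviewer variance.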


🧮 5. Firmware as a Living Variable

Many modern products are software-driven devices whose behavior evolves through firmware updates.
In SELJI’s system, firmware is treated as a dynamic performance factor, not a static specification.

How Firmware Influences Scoring

  1. Performance Evolution — Each firmware build may alter suction, navigation, or power efficiency. We detect these shifts through time-based sentiment changes and API 5.0 metadata.
  2. Reliability Tracking — Firmware update cadence, regression frequency, and fix latency feed into our Product Stability Index (PSI). Brands that deliver consistent, stable updates earn higher reliability weight.
  3. Version Normalization — Reviews and metrics are aligned to the latest stable firmware. Older data tied to outdated builds is down-weighted to prevent obsolete flaws from skewing results.
  4. Firmware Confidence Multiplier (FCM): Each product receives a multiplier between 0.8 and 1.1 based on update quality, applied as:
       AdjustedScore = BaseScore × FCM
     Stable, improvement-oriented firmware pushes scores upward; erratic or regressive updates reduce them.

By integrating firmware behavior, SELJI’s scores reflect current real-world performance rather than a launch-day snapshot.
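
The firmware adjustment can be sketched in a few lines. The 0.8 to 1.1 range and the formula AdjustedScore = BaseScore × FCM come from the text; the clamping behavior and the exponential down-weighting of reviews tied to older builds are assumptions chosen for illustration, since the text does not specify how older data is down-weighted.

```python
FCM_MIN, FCM_MAX = 0.8, 1.1  # multiplier range stated in the text


def adjusted_score(base_score: float, fcm: float) -> float:
    """Apply AdjustedScore = BaseScore * FCM.

    Clamping out-of-range multipliers is an assumption of this sketch.
    """
    fcm = max(FCM_MIN, min(FCM_MAX, fcm))
    return base_score * fcm


def firmware_weight(builds_behind: int, decay: float = 0.5) -> float:
    """Hypothetical review weight based on firmware age.

    A review written on the latest stable build gets weight 1.0; each
    build further behind multiplies the weight by `decay`.
    """
    if builds_behind < 0:
        raise ValueError("builds_behind must be non-negative")
    return decay ** builds_behind
```

Under these assumptions, a base score of 80 with an FCM of 0.9 drops to about 72, and a review written two builds ago counts a quarter as much as a current one.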


📊 6. Cost & Longevity Analysis

We quantify long-term value using both static specs and dynamic consumption data:

  • Total Cost of Ownership (TCO) — Consumables, energy, and accessory costs projected over 12–36 months.
  • Price Trajectory & Promo Frequency — Pulled directly via Amazon API 5.0 for precise MSRP trends.
  • Warranty and Return Rates — Derived from aggregated customer service data and sentiment.

Every recommendation reflects durability, affordability, and lifecycle value.
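
As a worked illustration of Total Cost of Ownership, the sketch below adds projected running costs to the purchase price. The parameter names and the flat monthly cost model are assumptions for this example; a fuller model might account for price changes or consumable replacement intervals.

```python
def total_cost_of_ownership(purchase_price: float,
                            monthly_consumables: float,
                            monthly_energy_kwh: float,
                            price_per_kwh: float,
                            months: int,
                            accessory_cost: float = 0.0) -> float:
    """Purchase price plus projected running costs over `months`.

    The text projects costs over 12 to 36 months; this sketch accepts
    any positive horizon and assumes constant monthly costs.
    """
    if months <= 0:
        raise ValueError("months must be positive")
    monthly_running = monthly_consumables + monthly_energy_kwh * price_per_kwh
    return purchase_price + accessory_cost + months * monthly_running


# Example: 400 purchase price, 5 per month in consumables,
# 2 kWh per month at 0.25 per kWh, projected over 24 months.
tco_24 = total_cost_of_ownership(400.0, 5.0, 2.0, 0.25, 24)  # 532.0
```

Here the running costs add 132 over two years, about a third of the purchase price, which is the kind of gap a sticker-price comparison would miss.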


🔬 7. Reliability & Durability Modeling

Reliability is modeled statistically through our Product Stability Index (PSI):

  • Survival-curve analysis on defect and return mentions.
  • Trend detection in post-update complaints.
  • Weighted historical variance to measure consistency across review periods.
  • Integration of firmware event markers to isolate software-related performance changes from hardware failures.

The PSI ensures each product’s reliability score mirrors its real trajectory over time.
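
The text does not say which survival method the PSI uses. One common choice is the Kaplan-Meier estimator, sketched below under the assumption that each unit contributes a duration (months until its first defect mention, or until it was last observed) and a flag saying whether a defect was actually reported. Units without a reported defect are treated as censored.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time until the first defect mention, or until the unit
               was last observed without one.
    observed:  True if a defect was reported, False if censored.
    Returns a list of (time, survival_probability) pairs, one for each
    distinct time at which at least one defect occurred.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all units that share the same time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # failed and censored units both leave the risk set
    return curve
```

Comparing curves before and after a firmware event marker is one way such an estimate could help separate software-related changes from hardware failures.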


🧪 8. Hands-On Validation

Whenever possible, SELJI performs its own testing to ground the data model in physical reality:

  • Noise Testing: Calibrated dBA measurements in controlled environments.
  • Cleaning Efficiency: Standardized debris compositions across surface types.
  • Navigation Tests: Timed obstacle courses to evaluate mapping precision.
  • Durability Cycles: Simulated long-term use to measure suction and battery decay.

Measured results feed back into the database to refine scoring accuracy.
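
One detail worth illustrating for noise testing: decibels are logarithmic, so repeated readings are usually combined by averaging sound energy rather than the dB values themselves. The sketch below shows that calculation; whether SELJI averages repeated readings this way is an assumption, and the function name is hypothetical.

```python
import math


def mean_sound_level(levels_dba):
    """Energy-average a list of dBA readings.

    Each level is converted to relative power (10 ** (L / 10)), the
    powers are averaged, and the mean is converted back to decibels.
    """
    if not levels_dba:
        raise ValueError("at least one reading is required")
    mean_power = sum(10 ** (level / 10) for level in levels_dba) / len(levels_dba)
    return 10 * math.log10(mean_power)
```

For readings of 60 and 70 dBA, the energy average is about 67.4 dBA rather than the arithmetic mean of 65, because the louder reading carries ten times the sound power.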


📈 9. Ranking Transparency & Governance

  • No Paid Placement — Affiliate relationships never alter scores.
  • Dynamic Re-Weighting: Scores are recomputed as new review, price, and firmware data arrives, so rankings track current evidence.
  • Audit Trail — Every score change is logged with cause and timestamp.
  • Reproducibility — Running the same dataset through the same version yields identical results.
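
The audit-trail and reproducibility points can be sketched together: fingerprint the exact inputs of a scoring run, then store that fingerprint with every logged score change. All field names and the choice of SHA-256 over canonical JSON are assumptions for illustration, not a description of SELJI's logging system.

```python
import hashlib
import json


def run_fingerprint(dataset, model_version: str) -> str:
    """SHA-256 of the dataset and model version in canonical JSON form.

    Sorting keys makes the hash independent of dictionary key order, so
    the same data and version always produce the same fingerprint.
    """
    payload = json.dumps(
        {"model_version": model_version, "dataset": dataset},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def audit_entry(product_id: str, old_score: float, new_score: float,
                cause: str, timestamp: str, fingerprint: str) -> dict:
    """Build one audit-log record for a score change (hypothetical schema)."""
    return {
        "product_id": product_id,
        "old_score": old_score,
        "new_score": new_score,
        "cause": cause,
        "timestamp": timestamp,  # supplied by the caller, e.g. ISO 8601 UTC
        "run_fingerprint": fingerprint,
    }
```

If two runs share a fingerprint but produce different scores, the pipeline itself is non-deterministic, which is exactly the condition the reproducibility guarantee rules out.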

🔍 10. Why It Matters

Most review sites summarize opinions; SELJI quantifies them.
By combining live API data, NLP-based sentiment modeling, firmware tracking, and human-verified testing, we turn the chaos of online reviews into clear, defensible evidence — empowering shoppers to make confident, data-backed choices.