How we turned a static registration list into a privacy-first, self-healing verification system - one subtle decision at a time.
It started, as most things do, with a bug report and a CSV file.
Social Summer of Code Season 5 was scaling fast. Nearly three thousand contributors, mentors, and project admins had registered, and we needed a single place where any one of them could type in their email and know - instantly, privately, beautifully - whether they were in. No backend. No authentication. Just a static site, a JSON file, and a promise that we'd handle their data with care.
This is the story of how we built it.
I. The Ghost in the DOM
The first sign of trouble was an error that only appeared the second time you pressed the button.
Uncaught (in promise) NoModificationAllowedError:
Failed to set the 'outerHTML' property on 'Element':
This element has no parent node.
The check page had a satisfying interaction: when you clicked "Check," the search icon in the button would swap out for a spinning loader. The code was straightforward - replace the icon element's outerHTML with a spinner <span>. When the check completed, query the spinner back out and swap in a fresh icon.
The problem was invisible on the first run. But outerHTML doesn't just change an element - it destroys it. The original DOM node is ripped out and replaced with an entirely new one. Meanwhile, our cached reference el.btnIcon still pointed at the old, now-parentless ghost node. On the second click, JavaScript tried to set outerHTML on an orphan, and the browser rightfully refused.
The fix was almost embarrassingly small:
// Before: stale reference after first replacement
el.btnIcon.outerHTML = '<span class="spin btn-icon">...</span>';
// After: always query the live DOM
const icon = document.querySelector("#check-btn .btn-icon");
if (icon) icon.outerHTML = '<span class="spin btn-icon">...</span>';
One line of defense. Fresh reference every time. The ghost was gone.
But this small bug carried a larger lesson: in a system with no framework, no virtual DOM, no reconciliation layer - you are the memory manager. Every outerHTML replacement is a quiet severance. If you cache references, you must know when they die.
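A minimal sketch of that discipline, assuming the same #check-btn .btn-icon selector used in the fix above. The helper name setButtonIcon and its root parameter are hypothetical; passing the root in only makes the function easy to exercise outside a browser.

```javascript
// Hypothetical helper: never holds on to the icon node between calls.
// `root` is anything that exposes querySelector (normally `document`).
function setButtonIcon(root, markup) {
  // Re-query every time: the previous node was detached by the last outerHTML write.
  const icon = root.querySelector("#check-btn .btn-icon");
  if (!icon) return false; // nothing to replace, so fail quietly
  icon.outerHTML = markup; // replaces the node; `icon` is stale from here on
  return true;
}
```

In the page this would be called with document as the root, once to show the spinner and once to restore the icon. Because no reference outlives the call, there is nothing left to go stale.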
II. The Email Problem
With the UI stable, we turned to a harder question - one that had been sitting in plain sight inside registrations.json:
{ "email": "[email protected]", "name": "Priya Sharma", "role": "contributor" }
Three thousand email addresses. In a public JSON file. Served from a static site. Indexed by search engines.
This is the kind of thing that feels fine during development and becomes a liability the moment real people are involved. Email addresses are personally identifiable information. Exposing them in a publicly accessible file - even one that's "just" a JSON blob fetched by JavaScript - is a privacy risk that no amount of good intentions can paper over.
We needed a way to verify "Is this email registered?" without ever storing the email itself.
The answer was hashing. Specifically, MD5.
Why MD5?
Let's be precise: MD5 is not a secure hashing algorithm for passwords or cryptographic signatures. It has known collision vulnerabilities. But for our use case - creating a one-way fingerprint of an email address for lookup purposes - it's perfectly adequate. We're not protecting secrets; we're anonymizing identifiers. The threat model is "someone browsing the JSON file shouldn't be able to harvest a list of email addresses," and MD5 handles that cleanly.
SHA-256 would have worked too. We chose MD5 because it produces shorter hashes (32 hex characters vs. 64), keeps the JSON leaner, and aligns with the well-established convention of email hashing used by services like Gravatar.
The Implementation
The transformation was two-sided.
Server side (the Node.js build script), we used Node's built-in crypto module:
const crypto = require("crypto");
// One-way hex fingerprint of a string, used for lookup rather than for security.
function md5(str) {
  return crypto.createHash("md5").update(str).digest("hex");
}
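The lookup only works if both sides hash exactly the same string. Here is a small sketch of a wrapper that applies the same lowercasing the browser uses before hashing. The name hashEmail is hypothetical, and whether the real build script normalizes this way is an assumption.

```javascript
const crypto = require("crypto");

// Hypothetical wrapper: normalize first, then fingerprint.
function hashEmail(email) {
  const normalized = email.toLowerCase(); // same step the browser applies before hashing
  return crypto.createHash("md5").update(normalized).digest("hex");
}

// Case differences in the input no longer change the fingerprint.
console.log(hashEmail("Someone@Example.com") === hashEmail("someone@example.com")); // true
```

If the two sides ever disagree on normalization (for example, one trims whitespace and the other does not), registered users will be told they are not registered. Keeping the rule in one small function makes that harder to get wrong.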
Client side (the browser), there's no native MD5 support. The Web Crypto API offers SHA-1, SHA-256, SHA-384, and SHA-512 - but not MD5. So we embedded a compact, pure-JavaScript MD5 implementation directly in the page. It's roughly 60 lines of bit manipulation - the four classic rounds (FF, GG, HH, II), the sine-derived constants, the Merkle-Damgård construction - all collapsed into a self-contained function.
The lookup flow became:
User types email
|
v
toLowerCase()
|
v
md5js()
|
v
Map.get(hash)
|
Found? --> Show welcome message
Not found? --> Show "not registered" guidance
The JSON file went from this:
{ "email": "[email protected]", "name": "Priya Sharma", "role": "contributor" }
To this:
{ "emailHash": "b866b340e733ba27b3c3a671f43c2977", "name": "Priya Sharma", "role": "contributor" }
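The email never leaves the user's browser. The JSON file contains only one-way hashes. The verification still works. But now, even if someone downloads the entire registration list, there are no addresses to read out of it. This guards against casual harvesting, which is the threat model above; it is not a guarantee against someone who already holds a candidate address and hashes it to test for a match.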
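The lookup flow can be sketched in a few lines. This is an illustration, not the page's actual code: buildIndex and findRegistration are hypothetical names, and hashFn stands in for the embedded MD5 function (md5js in the diagram).

```javascript
// Hypothetical helpers. `entries` is the array of { emailHash, name, role }
// objects loaded from the JSON file.
function buildIndex(entries) {
  const index = new Map();
  for (const entry of entries) {
    index.set(entry.emailHash, entry); // hash -> registration record
  }
  return index;
}

// `hashFn` is whatever MD5 implementation the page ships with.
function findRegistration(index, email, hashFn) {
  const hash = hashFn(email.toLowerCase());
  return index.get(hash) ?? null; // null means "not registered"
}
```

Building the Map once after the fetch keeps each check at a single hash plus one constant-time lookup, regardless of how many registrations the file holds.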
III. Taming the CSV
The registration data arrived as a CSV export - 3,463 rows, 23 columns, and all the chaos that implies.
Quoted fields containing commas. Project descriptions spanning multiple lines. Duplicate entries where someone registered twice (or three times). Timestamps in DD/MM/YYYY, HH:MM:SS AM/PM format - a notation that JavaScript's Date constructor will silently misparse if you feed it in raw.
We wrote update-registrations.js - a zero-dependency Node.js script that serves as the single source of truth pipeline:
regs.csv --> parse --> deduplicate --> hash --> registrations.json
The Parser
We couldn't use a simple split(","). CSV files with quoted fields require a stateful parser that tracks whether it's inside a quoted region. Our parser walks the string character by character, handling:
- Quoted fields with embedded commas: "React, Node.js, MongoDB"
- Escaped quotes within fields: "She said ""hello"""
- Newlines inside quoted fields (the project descriptions were full of these)
Deduplication
Multiple registrations from the same email were common. The rule: keep the latest entry, determined by comparing parsed timestamps. This ensures that if someone updated their name or details in a second submission, we use the most recent version.
const existing = byEmail.get(email);
// Keep the first entry seen; replace it only when a strictly later timestamp arrives.
if (!existing || (ts && existing.ts && ts > existing.ts)) {
  byEmail.set(email, { name, role, ts });
}
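Placed in a loop, the same rule looks like the sketch below. The function name dedupeByEmail and the row shape ({ email, name, role, ts }, with ts a Date or null) are assumptions made for illustration.

```javascript
// Hypothetical wrapper around the rule above.
// Each row: { email, name, role, ts } where ts is a Date or null.
function dedupeByEmail(rows) {
  const byEmail = new Map();
  for (const { email, name, role, ts } of rows) {
    const existing = byEmail.get(email);
    // Dates compare by their numeric value, so `>` means "later than".
    if (!existing || (ts && existing.ts && ts > existing.ts)) {
      byEmail.set(email, { name, role, ts });
    }
  }
  return byEmail;
}
```

One consequence of the condition as written: a row whose timestamp failed to parse never displaces an entry that is already in the map, so the result does not depend on row order when all timestamps are valid.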
The Timestamp
Every timestamp was parsed into a proper Date object, and we tracked the global maximum. This became the lastUpdated field in the output JSON - not the time we ran the script, but the time of the most recent registration. A subtle but important distinction: it tells the user "our data includes registrations up to this moment," which is more meaningful than "we regenerated this file at this moment."
The final result: 2,919 unique registrations, deduplicated from 3,463 rows, with lastUpdated set to 2026-05-05T16:07:03.000Z.
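A sketch of how such timestamps might be parsed explicitly instead of being handed to the Date constructor. The names parseTimestamp and latest are hypothetical, and treating the CSV values as UTC is an assumption; the real export may use a different zone.

```javascript
// Assumption: input looks like "05/05/2026, 04:07:03 PM" and is interpreted as UTC.
const TIMESTAMP_RE =
  /^(\d{1,2})\/(\d{1,2})\/(\d{4}),\s*(\d{1,2}):(\d{2}):(\d{2})\s*(AM|PM)$/i;

function parseTimestamp(text) {
  const m = TIMESTAMP_RE.exec(text.trim());
  if (!m) return null; // unparseable rows yield null rather than a wrong date
  const [, day, month, year, hour, minute, second, meridiem] = m;
  // 12 AM is hour 0 and 12 PM is hour 12, so reduce modulo 12 first.
  const hour24 = (Number(hour) % 12) + (meridiem.toUpperCase() === "PM" ? 12 : 0);
  return new Date(
    Date.UTC(Number(year), Number(month) - 1, Number(day), hour24, Number(minute), Number(second))
  );
}

// Global maximum over parsed dates, skipping nulls.
function latest(dates) {
  let max = null;
  for (const d of dates) {
    if (d && (!max || d > max)) max = d;
  }
  return max;
}
```

The day and month are read positionally, which is the whole point: the slash-separated form is ambiguous to a general-purpose parser, but not to a pattern that knows the export format. The sketch does not range-check the fields, so an out-of-range day would roll over into the next month.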
Running It
node update-registrations.js # defaults to ./regs.csv
node update-registrations.js data.csv # or specify a path
One command. No dependencies. Deterministic output.
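The last stage of the pipeline, hashing and assembling the output, could look roughly like this. It is a sketch under assumptions: buildOutput is a hypothetical name, the top-level registrations key is invented for illustration (only the per-entry fields and lastUpdated are described here), and sorting by hash is just one way to keep the output deterministic.

```javascript
const crypto = require("crypto");

// Hypothetical final stage. `byEmail` maps an already-normalized email
// to { name, role, ts }, where ts is a Date or null.
function buildOutput(byEmail) {
  let lastUpdated = null;
  const registrations = [];
  for (const [email, { name, role, ts }] of byEmail) {
    const emailHash = crypto.createHash("md5").update(email).digest("hex");
    registrations.push({ emailHash, name, role }); // the plaintext email is dropped here
    if (ts && (!lastUpdated || ts > lastUpdated)) lastUpdated = ts;
  }
  // Stable ordering so the same CSV always produces the same file.
  registrations.sort((a, b) =>
    a.emailHash < b.emailHash ? -1 : a.emailHash > b.emailHash ? 1 : 0
  );
  return {
    lastUpdated: lastUpdated ? lastUpdated.toISOString() : null,
    registrations,
  };
}
```

Writing JSON.stringify(buildOutput(byEmail)) to registrations.json would then complete the run.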
IV. Small Touches, Carried Far
The "Last Updated" Indicator
A timestamp buried in a JSON file is invisible. We surfaced it in the UI - a quiet line of text beneath the result area:
Last updated: May 5, 2026 at 09:37 PM
It appears only after the data loads (no layout shift, no flash of empty text), and it's styled to be present without competing: 12px, muted color, 70% opacity. It answers the question every anxious registrant has: "Is this data even current?"
The formatting uses the browser's toLocaleDateString and toLocaleTimeString with explicit options, so it adapts to the user's locale without us hardcoding a format.
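A sketch of that formatting, with the caveat that the exact options used on the page are not shown here. The function name formatLastUpdated and the specific option values are assumptions; the locale and timeZone parameters exist only so the behavior can be pinned down for illustration.

```javascript
// Hypothetical formatter. Passing `undefined` for locale and timeZone
// falls back to the browser's own settings.
function formatLastUpdated(iso, locale, timeZone) {
  const date = new Date(iso);
  const day = date.toLocaleDateString(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone,
  });
  const time = date.toLocaleTimeString(locale, {
    hour: "2-digit",
    minute: "2-digit",
    timeZone,
  });
  return `Last updated: ${day} at ${time}`;
}
```

Called as formatLastUpdated(data.lastUpdated) in the page (where data is the parsed JSON, a hypothetical variable), the date and time follow each visitor's locale and time zone, so the same stored instant reads differently for different visitors.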
The Work-in-Progress Banner
Season 5 was still being assembled. We needed a way to signal "this is real, but not final" - without making the site feel broken or untrustworthy.
A full-width banner sits above the sticky navigation on both pages:
[construction icon] Work in progress - some things may change before the official launch. [construction icon]
The gradient runs from SSoC's warm orange (#f97a4d) to its signature pink (#ec4899), tying it visually to the brand rather than making it feel like a generic warning. It's assertive enough to notice, calm enough not to alarm.
When the site is ready for launch, removing it is a one-line deletion in each HTML file.
V. Architecture of Absence
The most interesting thing about this system might be what it doesn't have.
No backend. The entire verification happens client-side against a static JSON file. This means zero server costs, zero cold starts, zero auth complexity, and deployment to any static host (GitHub Pages, Netlify, a bare S3 bucket).
No build step for the frontend. The HTML files are self-contained. Styles are in <style> tags. Scripts are in <script> tags. There's no bundler, no transpiler, no node_modules folder. You open the file in a browser and it works.
No framework. Vanilla JavaScript, vanilla CSS, vanilla HTML. The total JS footprint - including the MD5 implementation, the CSV-to-JSON script, the UI interactions, and the confetti animation - is smaller than most framework boilerplate.
No plaintext emails in the public payload. Names are visible (they have to be, for the greeting), but emails are hashed. The only time an email exists in plaintext is in the user's own input field, in their own browser, on their own machine.
This is a deliberate architecture. Every missing piece is a surface area that doesn't need securing, a dependency that doesn't need updating, a service that doesn't need monitoring.
VI. What We Shipped
Let's be concrete about the final state:
| Component | Purpose |
|---|---|
| index.html | The onboarding almanac - checklist, resources, role guides |
| check.html | Email verification page with MD5-based privacy |
| registrations.json | Hashed registration data (2,919 entries, no plaintext emails) |
| update-registrations.js | CSV-to-JSON pipeline (zero dependencies, one command) |
| regs.csv | Source registration export |
The flow for maintainers is simple:
- Export the latest registrations as CSV
- Drop it in the directory as regs.csv
- Run node update-registrations.js
- Deploy
The flow for registrants is simpler:
- Open the check page
- Type your email
- Know
VII. Reflections
There's a tendency in engineering to reach for complexity - to add a database when a file will do, to add a framework when vanilla will do, to add a server when static will do. The SSoC Onboarding Almanac is a reminder that some of the most useful tools are the simplest ones, built with care.
The MD5 hashing doesn't make this a "security product." The CSV parser doesn't make this a "data pipeline." The DOM fix doesn't make this a "framework." But together, they form something coherent: a system that respects its users' data, handles its own edge cases, and does exactly one thing well.
Three thousand people will use this page to find out if they're in. Most of them will never think about the hashing, the deduplication, the ghost in the DOM. And that's exactly how it should be.
The best infrastructure is the kind you never notice.
Built for Social Summer of Code, Season 5. India's largest open-source program.
