IT Infrastructure
Build Data Pipeline and Database for Gender-based Violence Dashboard
Background. TrackGBV analyzes court sentencing decisions in gender-based violence cases to expose patterns of judicial bias. Our dashboard currently pulls from an Excel sheet of ~2,600 manually analyzed cases. We have now built an AI extraction pipeline that processes cases automatically, and a human quality-control workflow that produces corrected output. Both datasets need to live together in one place to feed the dashboard.
The project: Design and build a new CockroachDB (Postgres-compatible) table that serves as the unified source of truth for the dashboard, then build the pipeline that keeps it synchronized.
Deliverables, organized into three 4-week phases:
Phase 1 (Weeks 1-4): Scoping, schema design, and column mapping
- Review the two source schemas (existing manual-case table, AI-corrected output table) and the current Excel dashboard source
- Produce a column mapping document defining every field transformation (direct renames, value format conversions, derived fields, dropped fields), reviewed and signed off by ICAAD
- Design the schema for the new unified dashboard table, including provenance and audit fields (last_synced_at, last_edited_by, source)
- Write the CockroachDB DDL
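For illustration only, here is a minimal Python sketch of how the column mapping categories listed above (direct renames, value format conversions, derived fields, dropped fields) might be expressed so the mapping document and the code stay in step. Every column name, the `DD/MM/YYYY` date convention, the semicolon separator, and the derived field are hypothetical placeholders; the real mapping will come out of Phase 1 scoping.

```python
from datetime import datetime

# All column names and value conventions below are hypothetical placeholders.

def parse_date(value):
    """Convert 'DD/MM/YYYY' text to ISO 'YYYY-MM-DD'; blank or None becomes None."""
    if value is None or str(value).strip() == "":
        return None
    return datetime.strptime(str(value).strip(), "%d/%m/%Y").date().isoformat()

def split_multi(value):
    """Split a semicolon-separated cell into a list of trimmed, non-empty values."""
    if value is None:
        return []
    return [part.strip() for part in str(value).split(";") if part.strip()]

# Direct renames: source column -> target column.
RENAMES = {"Case ID": "case_id", "Country": "jurisdiction"}

# Value format conversions: source column -> (target column, converter).
CONVERSIONS = {
    "Decision Date": ("decision_date", parse_date),
    "Charges": ("charges", split_multi),
}

# Source columns read only as inputs to derived fields.
DERIVED_INPUTS = {"Starting Sentence (months)", "Final Sentence (months)"}

# Source columns intentionally left out of the unified table.
DROPPED = {"Reviewer Notes"}

def derive_sentence_reduction(row):
    """Derived field: starting sentence minus final sentence, or None if either is missing."""
    start = row.get("Starting Sentence (months)")
    final = row.get("Final Sentence (months)")
    if start is None or final is None:
        return None
    return start - final

def transform_row(row):
    """Apply renames, conversions, and derived fields to one source row (a dict)."""
    out = {dst: row.get(src) for src, dst in RENAMES.items()}
    for src, (dst, convert) in CONVERSIONS.items():
        out[dst] = convert(row.get(src))
    out["sentence_reduction_months"] = derive_sentence_reduction(row)
    return out

def unmapped_columns(row):
    """Return source columns that no mapping category accounts for."""
    known = set(RENAMES) | set(CONVERSIONS) | DERIVED_INPUTS | DROPPED
    return set(row) - known
```

A check like `unmapped_columns` is one way to surface a source field that the signed-off mapping document does not yet cover, rather than silently dropping it.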
Phase 2 (Weeks 5-8): Migration and sync pipeline
- Build the one-time migration script that transforms the ~2,600 existing manual cases into the new table
- Build the ongoing sync pipeline that reads from the AI-corrected output table and writes into the new table, with overwrite semantics for updated rows
- Test against sample data, handle edge cases (missing values, format variations, multi-value fields)
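To make "overwrite semantics" concrete, the sketch below shows one possible shape for the sync step: an upsert keyed on a case identifier that replaces the existing row and stamps the audit fields named in Phase 1 (`last_synced_at`, `last_edited_by`, `source`). It is an assumption-laden illustration, not the required design. It uses Python's built-in `sqlite3` as a local stand-in so it runs without a database server; the table name `dashboard_cases`, the `case_id` key, and the other columns are hypothetical. The `INSERT ... ON CONFLICT ... DO UPDATE` form is also accepted by Postgres and CockroachDB, but column types and driver placeholder syntax would differ there.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table; in CockroachDB the timestamp column would likely be TIMESTAMPTZ.
DDL = """
CREATE TABLE dashboard_cases (
    case_id        TEXT PRIMARY KEY,
    jurisdiction   TEXT,
    decision_date  TEXT,
    source         TEXT NOT NULL,
    last_edited_by TEXT,
    last_synced_at TEXT NOT NULL
)
"""

# Insert new rows; on a key collision, overwrite every non-key column.
UPSERT = """
INSERT INTO dashboard_cases
    (case_id, jurisdiction, decision_date, source, last_edited_by, last_synced_at)
VALUES
    (:case_id, :jurisdiction, :decision_date, :source, :last_edited_by, :last_synced_at)
ON CONFLICT (case_id) DO UPDATE SET
    jurisdiction   = excluded.jurisdiction,
    decision_date  = excluded.decision_date,
    source         = excluded.source,
    last_edited_by = excluded.last_edited_by,
    last_synced_at = excluded.last_synced_at
"""

# Optional columns default to NULL when a source row omits them.
DEFAULTS = {"jurisdiction": None, "decision_date": None, "last_edited_by": None}

def sync_rows(conn, rows, source):
    """Upsert rows (dicts with at least 'case_id') and stamp provenance fields."""
    now = datetime.now(timezone.utc).isoformat()
    payload = [
        {**DEFAULTS, **row, "source": source, "last_synced_at": now}
        for row in rows
    ]
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(UPSERT, payload)
    return len(payload)
```

Running `sync_rows` twice for the same `case_id`, first with `source="manual"` and then with `source="ai_corrected"`, leaves a single row carrying the later values, which is the overwrite behavior described above.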
Phase 3 (Weeks 9-12): Testing, documentation, and handoff
- Integration testing across the full pipeline
- Produce documentation for handoff: schema reference, transformation logic for each derived field (with formulas), operational runbook for the sync pipeline, and known edge cases
- This documentation directly enables a follow-on Taproot project for a data analyst to migrate the dashboard's data source from Excel to the new table
Commitment: 6-8 hours per week across 12 weeks (some weeks will be lighter; Phase 2 will run slightly heavier).
Skills needed:
- SQL and relational schema design (Postgres or Postgres-compatible; CockroachDB experience a plus but not required)
- Python for data transformation (pandas, database connectors)
- Comfort with ETL/sync pipeline patterns including audit fields and overwrite logic
- Clear technical writing for schema and pipeline documentation
- Attention to detail on value conventions and data consistency
Note on sensitive content: This project involves working with structured data from court sentencing decisions in gender-based violence cases. A trauma-informed approach is required, though the work itself is primarily data engineering rather than case review. Strong attention to data security and ethical handling of legal records is essential.
Right now, our dashboard runs on an Excel sheet. This limits how fast we can onboard new cases, how many jurisdictions we can cover, and how reliably the data can be maintained. The AI extraction pipeline we built over the last two years dramatically increases our processing capacity, but until AI-extracted cases can flow into the same data source the dashboard uses, the two halves of our system cannot work together.
This volunteer project is the bridge. By designing the unified database table and building the pipeline that keeps it synchronized, the volunteer enables ICAAD to:
- Scale TrackGBV from ~2,600 cases to tens of thousands across new jurisdictions, because new cases can be ingested without manual Excel updates
- Maintain dashboard accuracy, because human-corrected data flows automatically into the dashboard's data source with a full audit trail
- Onboard future technical contributors more easily, because the data architecture will be documented and database-native rather than spreadsheet-bound
- Unblock the follow-on project that migrates the dashboard itself to the new data source
For a volunteer interested in using their data engineering skills for human rights impact, this is foundational infrastructure work that directly enables evidence reaching courts, policymakers, and survivor communities.
- Documenting the three source schemas involved (manual-case table, AI extraction output, AI-corrected output) with sample data ready to share
- Defining the overall target architecture and pipeline requirements in advance, so the volunteer has a clear starting point rather than needing to discover requirements from scratch
- Preparing a detailed scoping document for Week 1 onboarding that covers context, architecture, the three source tables, and the decisions that need to be made during the project
- Identifying which dashboard fields are visualization-critical vs. verification-only, so the volunteer's scope is bounded
- Ensuring CockroachDB access, AWS credentials, and GitHub repository access can be provisioned on day one
- Structuring the 12 weeks into three 4-week phases with clear deliverables at each boundary
The volunteer will work directly with me (Director of Analytics and Justice Tech) for all decisions. Our legal team is available for domain context where needed.
International Center for Advocates Against Discrimination Inc.
Location
Remote, US-NY
Website
https://www.icaad.ngo
Member Since
Oct 2021
Completed Taproot Plus Partnerships
0