Whether you're archiving before a migration, satisfying a retention obligation, or building the evaluation case for a platform change, at some point you need your data out of Archer — all of it, in a form you can actually use. Archer provides several export surfaces, each built for a different purpose, and none of them individually designed for "give me everything." This guide covers each method at practitioner level: what it exports, how, and where it falls short.
One framing note before the mechanics: exporting records is the easy 70%. The hard 30% — values lists, cross-references, leveled records, attachments, and history — is where extraction projects actually stall, and it's covered honestly in the second half of this guide.
The export surfaces Archer actually provides
Archer gives you five practical routes to your data, and a complete extraction typically uses several of them in combination:
- Search exports — CSV (and similar formats) from any application's search results, driven through the UI.
- Report exports — saved reports exported to CSV/Excel, useful for repeatable, pre-shaped pulls.
- The REST API — programmatic record retrieval, the backbone of any automated extraction.
- The older Web Services (SOAP) API — still present, still needed for a few operations the REST API doesn't cover cleanly, especially in older instances.
- Data feeds — Archer's integration engine, usually thought of for inbound data but usable outbound in some configurations.
Attachments — the documents, evidence files, and images attached to records — sit awkwardly across these surfaces and get their own section below, because they're consistently the most underestimated part.
Search and report exports: the accessible 70%
How it works. Open any application, run a search (or open a saved report), display the fields you need, and export the results — CSV for data work, Excel for humans. Anyone with appropriate access can do this without touching an API, which makes it the natural starting point.
What it's good for. Point-in-time snapshots of individual applications; archival copies of record data; quick record counts and field census work during discovery. If your goal is a defensible archive of what each application contained, systematic search exports per application get you most of the way.
Where it falls short. Four places, in practice:
- You must configure the export per application. The export contains the fields you chose to display, not the fields that exist. Fields nobody added to the search view silently don't ship. A complete export requires deliberately building an "all fields" view per application, which is tedious across dozens of applications and easy to get subtly wrong.
- Cross-references flatten. A cross-reference field exports as display text — the related record's title, typically — not as a key you can reliably re-join on. The relationship graph that makes your Archer data meaningful is not in the CSV.
- Leveled applications export awkwardly. Applications with parent/child levels don't produce a single clean flat file; you're exporting per level and reconstructing the hierarchy yourself.
- No attachments, no history. The CSV has a column saying an attachment exists; it does not contain the file. Field-level audit history isn't in scope for a search export at all.
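One way to catch the "fields nobody displayed" problem is to compare each export's header row against the list of fields the application is known to have. The following is a minimal sketch, assuming you already hold that field list from somewhere (the application's configuration screen or the API); the function name and sample data are hypothetical.

```python
import csv
import io


def missing_fields(csv_text: str, expected_fields: list[str]) -> list[str]:
    """Return expected field names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Normalise: drop a leading byte-order mark, surrounding spaces, and case.
    present = {name.lstrip("\ufeff").strip().lower() for name in header}
    return [f for f in expected_fields if f.strip().lower() not in present]


# Hypothetical example: the application has four fields, the search view showed three.
export = "Risk ID,Title,Owner\r\n101,Vendor outage,J. Smith\r\n"
expected = ["Risk ID", "Title", "Owner", "Inherent Score"]
print(missing_fields(export, expected))  # ['Inherent Score']
```

Matching on display names is fragile if labels were renamed between the field list and the export, so treat an empty result as encouraging rather than conclusive.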
The REST API and the Web Services API
The REST API is the primary programmatic route and the backbone of any automated extraction. The shape of the work: authenticate to obtain a session token, enumerate applications and their field definitions via the metadata endpoints, then page through records per application. Pagination is the operational constraint everyone learns quickly — content retrieval pages in blocks (commonly 1,000 records), so any real application requires looping with page tracking, retry handling, and rate discipline. Budget real engineering time for a robust extraction script, not an afternoon.
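The paging loop described above can be sketched independently of any particular endpoint. In this sketch `fetch_page(skip, top)` is a hypothetical callable that you would implement around your instance's authenticated record-retrieval call; the 1,000-record default and the retry settings are assumptions to tune, not documented limits.

```python
import time
from typing import Callable, Iterator

Page = list[dict]


def iter_records(fetch_page: Callable[[int, int], Page],
                 page_size: int = 1000,
                 max_retries: int = 3,
                 backoff_seconds: float = 2.0,
                 pause_seconds: float = 0.0) -> Iterator[dict]:
    """Yield every record, one page at a time, retrying failed page requests."""
    skip = 0
    while True:
        for attempt in range(1, max_retries + 1):
            try:
                page = fetch_page(skip, page_size)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up loudly; a silent gap is worse than a crash
                time.sleep(backoff_seconds * attempt)  # linear backoff
        if not page:
            return  # an empty page is treated as the end of the data
        yield from page
        skip += len(page)
        time.sleep(pause_seconds)  # rate discipline between pages


# Demonstration against an in-memory stand-in that fails once.
_data = [{"id": i} for i in range(2500)]
_calls = {"n": 0}


def fake_fetch(skip: int, top: int) -> Page:
    _calls["n"] += 1
    if _calls["n"] == 2:
        raise ConnectionError("transient failure")
    return _data[skip:skip + top]


print(len(list(iter_records(fake_fetch, backoff_seconds=0))))  # 2500
```

Stopping only on an empty page costs one extra request per application, but it avoids assuming that a short page always means the last page.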
What the REST API gives you that the UI doesn't: field-level data addressed by field ID rather than display name (which survives label changes), values-list internal IDs alongside display values, systematic access to every application without hand-building search views, and repeatability — you can rerun the extraction and diff the results.
The Web Services (SOAP) API predates the REST API and remains relevant for two reasons: older instances sometimes have configurations or modules where the SOAP surface exposes things the REST API doesn't cover cleanly, and some long-standing integration scripts in the wild are built on it. If you inherit extraction tooling from a previous admin, expect to find SOAP calls in it. For new extraction work, start REST-first and drop to SOAP only where you hit a gap.
Where the APIs fall short. The APIs give you data with fidelity, but they don't give you meaning: what ships is field IDs, values-list IDs, and record IDs. Turning that into a usable dataset requires the metadata mapping work described in the "complete extraction" section — and API access itself requires credentials and permissions that some organizations' Archer teams control tightly. Plan the access conversation early.
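To make the "IDs without meaning" point concrete, here is a rough sketch of the translation step. The record shape (`record_id` plus a `fields` dictionary keyed by field ID) is a simplified assumption for illustration, not the actual API payload, and the two lookup tables stand in for metadata you would have exported separately.

```python
def decode_record(raw: dict,
                  fields: dict[int, tuple[str, str]],
                  value_labels: dict[int, str]) -> dict:
    """Turn an ID-keyed record into a name-keyed one.

    fields maps field ID -> (field name, field type).
    value_labels maps values-list value ID -> display text.
    """
    decoded = {"record_id": raw["record_id"]}
    for field_id, value in raw["fields"].items():
        name, field_type = fields.get(
            field_id, (f"unknown_field_{field_id}", "unknown"))
        if field_type == "values_list":
            # Keep unmapped IDs visible instead of dropping them.
            value = [value_labels.get(v, f"unmapped_value_{v}") for v in value]
        decoded[name] = value
    return decoded


# Hypothetical metadata and record.
fields = {1201: ("Title", "text"), 1202: ("Status", "values_list")}
value_labels = {88: "Open", 91: "Escalated (inactive)"}
raw = {"record_id": 5012, "fields": {1201: "Vendor outage", 1202: [88, 91]}}
print(decode_record(raw, fields, value_labels))
```

The placeholder names for unknown fields and unmapped values are deliberate: they turn a gap in the metadata export into something you can search for in the output.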
Data feeds
Archer's data feed engine is primarily designed to bring data in, but it's worth knowing for extraction planning for one reason: instances that have been running for years often already have outbound feeds — to a data warehouse, a ticketing system, a reporting database. If an outbound feed to a warehouse exists, your extraction may be partially done already, and the warehouse copy may be easier to work with than Archer itself. Check with whoever owns your BI stack before building extraction tooling from scratch.
The caveat: existing feeds carry the fields someone configured them to carry, at the fidelity they needed at the time. Treat a warehouse copy as a head start to verify, not a complete extraction to trust.
Attachments: the retrieval reality
Attachments are where extraction estimates go to die, so here is the honest picture. Files attached to Archer records — evidence documents, policies, screenshots, signed approvals — are not part of any record export. Retrieving them means either downloading them through the UI record by record (viable for dozens, absurd for thousands) or scripting retrieval through the API: enumerate the attachment fields per application, resolve the file IDs per record, and download each file individually, preserving the record association as you go — because a folder of 4,000 files with no mapping back to their records is an archive in name only.
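A sketch of the scripted route, focused on the part that is easy to skip: the mapping back to records. Here `download(file_id)` is a hypothetical callable wrapping whatever attachment retrieval call your instance exposes, and the reference dictionaries are assumed to come from your own enumeration of attachment fields.

```python
import csv
import hashlib
from pathlib import Path
from typing import Callable, Iterable

REF_KEYS = ("application", "record_id", "field_id", "file_id", "filename")


def retrieve_attachments(refs: Iterable[dict],
                         download: Callable[[int], bytes],
                         out_dir: Path) -> Path:
    """Save each file and write a manifest tying it to its source record."""
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = out_dir / "manifest.csv"
    columns = list(REF_KEYS) + ["saved_as", "bytes", "sha256"]
    with manifest_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        for ref in refs:
            content = download(ref["file_id"])
            # Prefix with the file ID so two files named "evidence.pdf"
            # from different records cannot overwrite each other.
            saved_as = f'{ref["file_id"]}_{Path(ref["filename"]).name}'
            (out_dir / saved_as).write_bytes(content)
            row = {key: ref[key] for key in REF_KEYS}
            row.update(saved_as=saved_as, bytes=len(content),
                       sha256=hashlib.sha256(content).hexdigest())
            writer.writerow(row)
    return manifest_path
```

The checksum column is optional, but it lets you later show that an archived file is byte-for-byte the one that was retrieved.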
Before you script anything, get a count. A census of attachment fields and the files they hold (an extension of the field census described below) will tell you whether you have three hundred attachments or thirty thousand, and that number changes the plan more than any other single fact.
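Getting that count can be as simple as tallying file references per attachment field across the records you have already pulled. This is a minimal sketch under the same simplified, hypothetical record shape (a `fields` dictionary keyed by field ID, with attachment fields holding a list of file IDs).

```python
from collections import Counter
from typing import Iterable


def attachment_census(records: Iterable[dict],
                      attachment_field_ids: list[int]) -> Counter:
    """Count attached files per attachment field across all records."""
    totals = Counter({field_id: 0 for field_id in attachment_field_ids})
    for record in records:
        for field_id in attachment_field_ids:
            # A missing or empty field contributes zero.
            totals[field_id] += len(record["fields"].get(field_id) or [])
    return totals


# Hypothetical records: field 1301 holds evidence files, 1302 holds screenshots.
records = [
    {"record_id": 1, "fields": {1301: [9001, 9002]}},
    {"record_id": 2, "fields": {1301: [9003], 1302: None}},
]
census = attachment_census(records, [1301, 1302])
print(sum(census.values()))  # 3
```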
The parts that make Archer extraction genuinely hard
Five things, consistently:
- Values lists. Archer fields overwhelmingly resolve to values lists — the controlled vocabularies behind every dropdown. Exports give you either display text (which changes over time) or internal IDs (which mean nothing without the mapping). A faithful extraction captures the values lists themselves, including inactive values still referenced by old records.
- Cross-references. The relationships between records — risk to control, control to policy, finding to remediation — are the actual content of a mature GRC program, and they're the first thing lost in flat exports. Preserving them means exporting relationship data deliberately, via the API, with both sides' record IDs.
- Leveled records. Parent/child structures within applications need their hierarchy captured explicitly, or the export is a pile of rows that used to be a tree.
- Calculated fields. The values export; the formulas don't (they're configuration, not data). If downstream users will ask "how was this score computed," you need the formula definitions documented separately — while someone who understands them is still available.
- Audit history. Field-level change history is the hardest surface to extract comprehensively, and for some retention regimes it's also the one you're obligated to keep. Where full history extraction isn't practical, a documented, dated archive of the history you could capture — plus your auditor's sign-off on the approach — beats a silent gap.
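For cross-references in particular, the deliberate export amounts to writing an edge list. The sketch below assumes the same simplified record shape as an illustration (cross-reference fields holding a list of related record IDs); the names are hypothetical.

```python
from typing import Iterable, NamedTuple


class Edge(NamedTuple):
    source_record_id: int
    field_id: int
    target_record_id: int


def extract_edges(records: Iterable[dict],
                  xref_field_ids: list[int]) -> list[Edge]:
    """Emit one edge per related record, keeping both sides' record IDs."""
    edges = []
    for record in records:
        for field_id in xref_field_ids:
            for target in record["fields"].get(field_id) or []:
                edges.append(Edge(record["record_id"], field_id, target))
    return edges


def dangling_targets(edges: Iterable[Edge],
                     extracted_ids: Iterable[int]) -> list[int]:
    """List referenced record IDs that are missing from the extracted set."""
    return sorted({e.target_record_id for e in edges} - set(extracted_ids))


# Hypothetical data: risk 10 references controls 20 and 21; 21 was never extracted.
risks = [{"record_id": 10, "fields": {1401: [20, 21]}}]
edges = extract_edges(risks, [1401])
print(dangling_targets(edges, extracted_ids=[10, 20]))  # [21]
```

The dangling-target check doubles as a scope test: a referenced record that was never extracted usually means an application was left out of the inventory.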
What a complete extraction actually requires
Pulling the threads together — a complete, defensible Archer extraction is a small project with a discovery phase, not a big download. In sequence:
1. Application inventory. Enumerate every application in the instance, including the dormant ones, with record counts per application. This is your scope document; everything else keys off it. (Our Archer Inventory Worksheet is a structured template for exactly this.)
2. Field census. Per application, enumerate every field with its type: text, values list, cross-reference, calculated, attachment, sub-form. The census tells you which of the hard problems above you actually have, and at what volume.
3. Values-list mapping. Export the values lists themselves and map internal IDs to display values, including inactive entries. This is what makes API exports readable a year from now.
4. Record extraction. API-driven, paged, per application, addressed by field ID, with row counts reconciled against the inventory from step 1.
5. Relationship extraction. Cross-reference data exported with both-side record IDs, so the graph survives.
6. Attachment retrieval. Scripted, with record association preserved, sized by the census from step 2.
7. Verification. Counts reconciled, spot-checks against the live system, and a written record of what was and wasn't captured: the document your auditor and your future self both want.
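The count reconciliation in the verification step lends itself to a small script. This is a sketch only: both inputs are plain dictionaries of application name to record count that you would fill from your own inventory and extraction output, and the status labels are arbitrary.

```python
def reconcile(inventory: dict[str, int], extracted: dict[str, int]) -> list[dict]:
    """Compare expected record counts per application with what was extracted."""
    report = []
    for app in sorted(set(inventory) | set(extracted)):
        expected = inventory.get(app)
        actual = extracted.get(app, 0)
        if expected is None:
            status = "not in inventory"  # extracted something nobody scoped
        elif actual == expected:
            status = "ok"
        else:
            status = "mismatch"
        report.append({"application": app, "expected": expected,
                       "extracted": actual, "status": status})
    return report


# Hypothetical counts.
inventory = {"Risks": 1200, "Controls": 850, "Findings": 40}
extracted = {"Risks": 1200, "Controls": 849, "Legacy Import": 12}
for line in reconcile(inventory, extracted):
    print(line)
```

Records created or deleted between the inventory and the extraction will show up as mismatches, so note both timestamps in the written record alongside the report.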
All of this is doable manually — teams do it every year, and this guide should be enough to plan it. It's also exactly the kind of systematic, repetitive, error-prone work that automation exists for: the discovery steps (inventory, census, values-list mapping) are deterministic reads of Archer's metadata surfaces, which is why we've built that discovery approach into the Migration Readiness Scan — currently in early access. Either way, do the discovery before you promise anyone a migration timeline: the census is what makes the estimate real.
If the extraction is the first step of a platform move, the Archer to Vanta migration guide covers the full sequence from here, and the legacy migration assessment will give you a complexity score for the whole undertaking in about five minutes.