What Happens During ACX Review

What Happens During ACX Review: The Complete Guide for Authors Who Want to Get It Right the First Time
Publishing an audiobook on Audible is not just about recording your voice or generating an AI narration. It is about passing a technical and editorial gate that many first-time authors fail without understanding why. The ACX review process is quiet, non-negotiable, and surprisingly easy to fail even when your audio sounds great to your own ears.
If you have ever uploaded a finished audiobook only to receive a rejection email days later, you already know how frustrating that gap between effort and approval feels. This guide explains exactly what ACX reviewers check, why most submissions fail, and how to make sure yours does not.
Who This Is For
This guide is written for indie authors, nonfiction writers, historians, novelists, and first-time audiobook creators who are preparing to submit on ACX or have already been rejected and want to understand what went wrong. If you are using an AI voice generator to produce your audiobook and want to make sure it clears ACX review without a resubmission, this is especially for you.
TL;DR
- ACX checks five core areas: audio quality, file structure, metadata accuracy, narration completeness, and the retail sample.
- The most common rejection reasons are background noise, inconsistent RMS levels, missing credits, and metadata mismatches with the Kindle version.
- ACX requires audio files to meet specific technical specs: RMS between minus 23 dB and minus 18 dB, peak levels no higher than minus 3 dB, and a noise floor at or below minus 60 dB.
- Many first-time authors need two to three submissions before passing. Knowing the checklist in advance can cut that to one.
- Narration Box's audiobook creation platform produces ACX-compliant audio automatically with state-of-the-art AI voices that carry emotion, pacing, and accent control out of the box.
The Most Asked Question About ACX Review
How long does ACX review take, and what exactly are they checking?
ACX review typically takes 14 business days from the date of submission. During that window, a team of reviewers checks your audio files against technical specifications, verifies that your chapter structure matches the required format, confirms that your narration matches the full manuscript, and validates that your metadata aligns with your existing Kindle or print listing on Amazon. If anything fails, your submission is rejected and the clock resets on resubmission.
What Actually Happens Inside the ACX Review Process
Most authors treat ACX review like a formality. It is not. It is a structured quality gate with real humans and automated tools checking specific parameters. Here is what they look at.
Audio Quality Verification
This is the layer where most submissions fail. ACX uses automated tools to measure your audio against a strict set of technical specifications. The required standards are:
- RMS (root mean square) level must fall between minus 23 dB and minus 18 dB. This is the average loudness of your audio. Too quiet and your audiobook feels distant. Too loud and it becomes fatiguing to listen to.
- Peak levels cannot exceed minus 3 dB. This headroom protects against clipping, the audible distortion that occurs when peaks hit full scale (0 dB) and that no mastering fix can fully clean up after the fact.
- The noise floor must sit at or below minus 60 dB. This means the silence between your sentences must actually be silent. Any hum, fan noise, air conditioning, or ambient room sound that rises above this threshold is grounds for rejection.
These are not soft guidelines. ACX measures them precisely. A recording session done in a noisy home office, or an AI-generated file that has not been mastered correctly, will fail this check even if it sounds acceptable through headphones.
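To make the three measurements concrete, here is a minimal sketch of how they can be computed from raw samples. It assumes samples are floats scaled so that 1.0 is full scale, and that you can isolate a stretch of room tone with no speech. The function names are illustrative only, and this is not ACX's own measurement tool.

```python
import math

RMS_RANGE_DB = (-23.0, -18.0)   # thresholds quoted in this guide
PEAK_CEILING_DB = -3.0
NOISE_FLOOR_MAX_DB = -60.0

def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def rms_db(samples):
    """Average (root mean square) level of a run of samples, in dB."""
    return to_db(math.sqrt(sum(s * s for s in samples) / len(samples)))

def peak_db(samples):
    """Level of the single loudest sample, in dB."""
    return to_db(max(abs(s) for s in samples))

def check_levels(chapter, room_tone):
    """chapter: all samples in the file; room_tone: a stretch with no speech."""
    return {
        "rms_ok": RMS_RANGE_DB[0] <= rms_db(chapter) <= RMS_RANGE_DB[1],
        "peak_ok": peak_db(chapter) <= PEAK_CEILING_DB,
        "noise_floor_ok": rms_db(room_tone) <= NOISE_FLOOR_MAX_DB,
    }
```

A file can sound fine and still fail one of these: a quiet fan hum at an amplitude of 0.01, for example, measures about minus 40 dB, well above the noise floor limit.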
File Structure and Formatting Check
ACX requires a specific chapter architecture. Your audiobook cannot be one long audio file. It must be broken into separate files that follow this structure:
- Opening credits come first. This file introduces the title, author name, and narrator name. It is mandatory, and missing it is a common reason for rejection.
- Each chapter must be its own individual audio file. You cannot combine chapters, and you cannot leave any chapter out. The files must be uploaded in order.
- Closing credits come at the end. This file mirrors the opening credits and formally closes the audiobook.
- A retail sample must be included separately. This is the preview that potential listeners hear before purchasing. ACX reviewers check that it is the correct length, that it represents the content faithfully, and that the audio quality matches the rest of the book.
First-time authors frequently underestimate this structural requirement. A strong recording with the wrong file architecture will still be rejected.
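The structure above can be checked before upload with a short script. This is a sketch under an assumed naming convention (`opening_credits`, `chapter_NN`, `closing_credits`, `sample`), which is hypothetical and not something ACX mandates. Adapt the patterns to your own file names and to books with prologues or epilogues.

```python
import re

def validate_structure(files):
    """Return a list of problems for files given in upload order.

    Assumes hypothetical names such as 'opening_credits.mp3',
    'chapter_01.mp3', 'closing_credits.mp3' and 'retail_sample.mp3'.
    """
    problems = []
    body = [f for f in files if "sample" not in f]
    samples = [f for f in files if "sample" in f]

    if not body or "opening_credits" not in body[0]:
        problems.append("first file must be the opening credits")
    if not body or "closing_credits" not in body[-1]:
        problems.append("last file must be the closing credits")

    # Collect chapter numbers in the order the files were listed.
    chapters = [int(m.group(1)) for f in body
                if (m := re.search(r"chapter_(\d+)", f))]
    if not chapters:
        problems.append("no chapter files found")
    elif chapters != list(range(1, len(chapters) + 1)):
        problems.append("chapters must run 1..N with none missing or out of order")

    if len(samples) != 1:
        problems.append("exactly one retail sample file is expected")
    return problems
```

An empty list means the set of files matches the expected architecture. Anything else names the gap to fix before submitting.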
Metadata Verification
ACX cross-references your audiobook submission against your existing Amazon listing. If you are publishing an audiobook version of a book already on Kindle or in print, the metadata must match exactly.
This includes the book title, subtitle, author name, and series information if applicable. A difference as small as a missing comma in a subtitle, or an abbreviated author name, can flag the submission for manual review or outright rejection.
If your book is not yet on Amazon or does not have a matching Kindle edition, you need to resolve this before submitting to ACX. The platform is built as an extension of the Amazon ecosystem, and it expects your audiobook to connect cleanly to an existing product listing.
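Because the match has to be exact, it helps to compare the two sets of metadata character for character before you submit. The sketch below assumes you have typed both sets into plain dictionaries by hand. The field names are illustrative, and nothing here calls an Amazon or ACX API.

```python
FIELDS = ("title", "subtitle", "author", "series")

def metadata_mismatches(audiobook, listing):
    """Return {field: (audiobook_value, listing_value)} for every field that differs.

    The comparison is deliberately exact: no trimming, no case folding,
    no punctuation cleanup, since small differences are what get flagged.
    """
    return {
        field: (audiobook.get(field, ""), listing.get(field, ""))
        for field in FIELDS
        if audiobook.get(field, "") != listing.get(field, "")
    }

# Hypothetical example: the subtitle is missing a comma in the audiobook entry.
listing = {"title": "Salt Roads", "subtitle": "Trade, Empire, and the Sea", "author": "Jane Q. Doe"}
audiobook = {"title": "Salt Roads", "subtitle": "Trade, Empire and the Sea", "author": "Jane Q. Doe"}
print(metadata_mismatches(audiobook, listing))
```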
Narration Completeness Check
Reviewers verify that your narration covers the full manuscript. They are checking for:
- Chapters that are complete and not cut short.
- Narration that matches the manuscript without skipped passages, reordered sections, or altered text.
- No missing sections, including prologues, epilogues, author notes, and acknowledgments if they appear in the source text.
This check matters significantly for nonfiction authors and historians who often have dense back matter including bibliographies, citations, and indexes. ACX does not require you to narrate a full bibliography, but any section that appears in the printed or digital book and is expected to be part of the listening experience should be present.
Retail Sample Review
The retail sample is the audio preview that Audible uses to let listeners evaluate your audiobook before buying. ACX specifies that your retail sample should be between one and five minutes long and must come from the body of the book, not from your credits or introduction.
Reviewers check the audio quality of this file independently, even if the rest of your submission passes. A retail sample with noise, uneven levels, or pacing problems will delay or block your approval even if your full audiobook files are technically compliant.
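The length requirement is easy to verify mechanically. This sketch assumes you still have the retail sample as an uncompressed WAV master, because Python's standard `wave` module does not read MP3 files. The one-to-five-minute window is the one stated in this guide.

```python
import wave

def wav_duration_seconds(path):
    """Length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def sample_length_ok(seconds, shortest=60.0, longest=300.0):
    """True when the retail sample is between one and five minutes long."""
    return shortest <= seconds <= longest
```

Calling `sample_length_ok(wav_duration_seconds("retail_sample.wav"))` on a hypothetical file name covers the length check only. The sample's audio quality still needs the same level checks as every chapter.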
Common Reasons Audiobooks Fail ACX Review
Reports from author communities and ACX forums are consistent. The most frequent causes of rejection are:
- Background noise that exceeds the minus 60 dB noise floor threshold. This is the single most common failure point for home recordings.
- Inconsistent audio levels across chapters. Authors who record over multiple sessions in different room conditions often end up with chapter-to-chapter volume variation that fails the RMS check.
- Missing opening or closing credits. Straightforward to add, but easy to forget.
- Metadata mismatch between the audiobook submission and the Amazon listing. This is especially common when an author updates their book title or subtitle after the initial listing was created.
- Incorrect chapter structure, usually from uploading one combined file instead of individual chapter files.
Many first-time authors submit two or three times before passing. Each resubmission resets the 14-business-day review clock. For an author with a launch date in mind, that delay compounds quickly.
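To see how the delay adds up, here is a rough calculation. It assumes each review takes the full 14 business days quoted above, that you resubmit the same day a rejection arrives, and it ignores public holidays, so real timelines will usually be longer.

```python
from datetime import date, timedelta

def add_business_days(start, days):
    """Step forward day by day, counting only Monday through Friday."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            days -= 1
    return current

def earliest_approval(first_submission, attempts, review_days=14):
    """Date the final review ends if every attempt uses the full window."""
    current = first_submission
    for _ in range(attempts):
        current = add_business_days(current, review_days)
    return current

start = date(2024, 1, 1)              # a Monday
print(earliest_approval(start, 1))    # 2024-01-19
print(earliest_approval(start, 3))    # 2024-02-28
```

Under these assumptions, passing on the third attempt instead of the first pushes approval back by almost six weeks.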
Why Most DIY Solutions Create These Problems
Home recording setups introduce acoustic variables that are difficult to control consistently. Professional studio time is expensive and often inaccessible for indie authors. Freelance narrators on ACX can cost between 200 and 400 dollars per finished hour, and for a full-length nonfiction book, that cost grows fast.
The deeper problem is that even after paying for narration, the technical compliance burden still falls on the author. You are still responsible for editing, mastering, formatting, and uploading correctly structured files. The narrator records. You still do everything else.
AI voice generation solves part of this problem but introduces its own challenges. Most AI text-to-speech tools produce flat, emotionally neutral audio. A biography narrated in a monotone voice may pass ACX technical review but will receive poor listener ratings, and in the long run, poor ratings damage discoverability more than a slow review process does.
The solution is not just technically compliant audio. It is emotionally compelling audio that also happens to meet every ACX specification.
How Narration Box Solves This for Audiobook Authors
Narration Box has built a dedicated audiobook creation product designed around the specific needs of authors. It converts documents in EPUB, PDF, DOC, DOCX, and other formats directly into finished audiobook audio. Here is how it works and why it matters for ACX compliance.
What the Narration Box Audiobook Product Actually Does
You upload your manuscript in whatever format you have it. Narration Box accepts EPUB, PDF, Word documents, and most common text formats.
The platform selects an AI narrator from its library of more than 700 voices. You can also choose one yourself based on tone, genre fit, and language requirements.
The AI voice automatically detects the emotional context of the text and narrates with appropriate expression. A tense chapter reads tense. A reflective passage slows into something more meditative. This happens without manual adjustment.
If you want more specific control, you have two options. You can insert inline emotion tags directly into your manuscript text using square brackets. For example: "He opened the door slowly. [whispered] I did not expect this." The AI voice reads the tag as a direction and delivers that line in a whisper. Or you can use style prompts to shape the entire narration. You can instruct the narrator to speak with a British accent, to use a slower and more deliberate pace, or to adopt a conversational tone throughout.
The platform also handles language detection automatically. If you upload a manuscript written in German, the AI narrator will speak in German with a natural German accent. If you want a Canadian accent narrating a French text, you can prompt that directly and the narrator will deliver it.
Audio output from Narration Box is built to meet ACX technical specifications. The platform handles mastering so that your files land within the correct RMS range, with appropriate peak control and a clean noise floor.
Enbee V2 Voices for Audiobook Narration
The Enbee V2 voices are the most capable narrators available on the Narration Box platform. They are state-of-the-art AI voices trained to understand context, shift register between sections, and deliver genuine emotional variability without sounding mechanical.
For nonfiction authors, Etta is a strong choice. Her voice carries the kind of measured authority that works well for history, memoir, research-driven narrative, and biography. She does not over-emote, but she does not read flatly either. She adjusts pacing and weight based on what the text is communicating.
For historians writing narrative history, Harlan brings a literary quality that holds a reader's attention across long arcs. His pacing is deliberate and his register has weight without feeling heavy.
For novelists working in genre fiction or epic storytelling, Lenora handles the range that fantasy and science fiction demand. Her voice shifts between expository narration and character-driven dialogue without losing coherence.
For thriller and suspense authors, Harvey creates the tension that fast-paced narrative requires. His voice is economical and drives forward momentum.
Every Enbee V2 voice is multilingual. The full language list includes English, French, Spanish, Portuguese, Arabic, Mandarin, German, Hindi, Japanese, Urdu, Punjabi, Swahili, and more than 50 additional languages and regional dialects. You do not need a separate narrator for each language edition of your book. The same voice handles all of them with native-level accent and emotional range.
Style prompting works as a real-time instruction layer. You tell the voice how to speak, and it listens. Tell it to adopt a slower pace for a difficult chapter. Tell it to shift into a warmer register for a personal section. Tell it to narrate with a regional accent. It adjusts immediately and holds that adjustment across the section.
Inline emotion tags work at the sentence and phrase level. This is particularly useful for nonfiction authors who want a dramatic read of a key historical moment, or novelists who need a character's internal monologue to sound genuinely different from the surrounding narration.
How to Make Your Audiobook with Narration Box for ACX Submission
This is not a generic walkthrough. These are the specific steps that matter for getting through ACX review.
Step 1: Prepare Your Manuscript
Before you upload anything, clean your manuscript. Remove formatting artifacts, fix inconsistent hyphenation, and make sure every section that should be narrated is included and in the correct order. If your book has an author note, a foreword by someone else, or a lengthy bibliography, decide now whether you are including those sections and flag them accordingly.
Add your inline emotion tags at this stage if you want granular emotional control. Mark the passages where the tone should shift, where a character speaks, or where a moment of weight needs to land differently.
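Before uploading, it can help to list every square-bracket tag in the manuscript so you can review them in one pass. The sketch below is a plain text scan on your own machine and does not use any Narration Box API. It treats anything in square brackets as a tag, so stray brackets such as citation markers show up in the list too.

```python
import re

TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")

def find_tags(text):
    """Return (line_number, tag_text) for every [bracketed] span, in order."""
    return [
        (text.count("\n", 0, match.start()) + 1, match.group(1))
        for match in TAG_PATTERN.finditer(text)
    ]

manuscript = (
    "He opened the door slowly. [whispered] I did not expect this.\n"
    "The ledger was dated 1843 [12].\n"
)
for line_number, tag in find_tags(manuscript):
    print(line_number, tag)
```

Here the second match, `12`, is a citation marker rather than a direction, which is exactly the kind of leftover worth cleaning up before generation.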
Step 2: Upload Your Document to Narration Box
Go to your Narration Box studio and upload the manuscript file. The platform accepts EPUB, PDF, DOCX, and other standard formats. The document import tool parses your text and prepares it for narration.
Select your Enbee V2 narrator. For most nonfiction and historical writing, Etta or Harlan will give you the appropriate register. For fiction, choose Lenora, Harvey, or Lorraine, depending on genre.
Enter your style prompt in the prompt field. Be specific. "Narrate in a calm, measured tone with deliberate pacing suitable for a history book" is more useful than "speak slowly." The more context you give the voice, the better it calibrates to your intent.
Step 3: Generate and Review the Narration
Let the platform generate your audio. Listen chapter by chapter, not as one continuous file. Pay attention to transitions between chapters. Check that emotional tags are landing correctly. Verify that the pacing does not drag or rush in sections where the text changes register.
If something feels off, go back to your style prompt and adjust the instruction. You can also insert additional inline tags at specific points. This is iterative, and two or three passes is normal even for experienced users.
Step 4: Export in ACX-Compliant Format
Narration Box exports audio in formats that meet ACX specifications. When exporting, confirm that your files are structured as individual chapter files, not one combined export. Export your opening credits, each chapter, and your closing credits as separate files.
Check your retail sample at this stage. It should be between one and five minutes, taken from the body of your book, not from the credits. Export it as a separate file.
Step 5: Run a Technical Check Before Uploading
Before you submit to ACX, run your audio files through a free analysis tool such as Auphonic or the ACX Check plug-in for Audacity. Confirm that your RMS levels fall between minus 23 dB and minus 18 dB, that no peaks exceed minus 3 dB, and that your noise floor reads at or below minus 60 dB.
If any file falls outside these ranges, fix it before the final export, either by modifying your generation settings in Narration Box or by applying light mastering in Audacity.
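When a file's RMS is off target, the usual first step is a uniform gain change, and it is worth working out what that does to the other two numbers before applying it. The sketch below relies on one fact: a uniform gain of N dB shifts RMS, peak, and noise floor by the same N dB. The target of minus 20 dB is an assumption chosen from the middle of the allowed range, and the function name is illustrative.

```python
def plan_gain(rms_db, peak_db, noise_db, target_rms_db=-20.0):
    """Predict levels after a uniform gain that moves RMS to the target."""
    gain = target_rms_db - rms_db
    peak_after = peak_db + gain
    noise_after = noise_db + gain
    return {
        "gain_db": gain,
        "peak_after_db": peak_after,
        "noise_after_db": noise_after,
        "peak_ok": peak_after <= -3.0,          # otherwise a limiter is needed
        "noise_floor_ok": noise_after <= -60.0,  # otherwise reduce noise first
    }

# Measured values here are made up for illustration.
print(plan_gain(rms_db=-26.0, peak_db=-10.0, noise_db=-64.0))
```

In this made-up case, raising the file by 6 dB fixes the RMS and keeps peaks at minus 4 dB, but lifts the noise floor to minus 58 dB, so the noise has to be dealt with before the gain is applied.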
Step 6: Test With a Fresh Listener
Before submission, play your audiobook for someone who has not read the book. Not a beta reader who knows the story. Someone completely unfamiliar with the content.
Ask them: Does the narrator feel like they understand the material? Do emotional moments land? Is there a point where the voice felt flat or disconnected from the text? Does the opening chapter make them want to hear more?
This test is more valuable than any technical checklist because it tells you whether the narration is doing its actual job, which is to hold a listener's attention for hours.
ACX Compliance Checklist for AI-Generated Audiobooks
Use this before every submission.
- Audio technical specs: RMS between minus 23 dB and minus 18 dB, peaks no higher than minus 3 dB, noise floor at or below minus 60 dB.
- File structure: Opening credits file, individual chapter files in order, closing credits file, retail sample file.
- Metadata: Title, subtitle, author name, and series details match your existing Amazon listing exactly.
- Narration completeness: Every chapter and required section is present and matches the manuscript.
- Retail sample: Between one and five minutes, from the body of the book, clean audio quality.
- Voice rights: When using AI-generated audio, confirm that your usage license from the platform covers commercial audiobook distribution. Narration Box licenses cover commercial use. Always confirm this for any platform you use.
How to Earn Money from Your Audiobook and Build Reviews
Revenue Structure on ACX
ACX offers two royalty models. The exclusive distribution model through Audible and Amazon pays 40 percent royalties. The non-exclusive model, which allows you to distribute through other platforms alongside Audible, pays 25 percent.
Exclusive authors also have access to ACX bounties, which are one-time payments triggered when a new Audible member purchases your audiobook as their first listen. Bounty amounts have historically ranged from 50 to 75 dollars per qualifying sale, though these figures are subject to change. For authors in genres with strong Audible audiences, such as thriller, fantasy, business, and self-help, the bounty program can add meaningfully to early-launch revenue.
Revenue Beyond ACX
Authors who choose non-exclusive distribution can submit their audiobook to Findaway Voices, which distributes to more than 40 platforms including Apple Books, Google Play Books, Kobo, Scribd, and library platforms like OverDrive and hoopla. Findaway Voices pays 80 percent royalties on net receipts. For authors with an existing audience across platforms, the wide distribution strategy often outperforms Audible-exclusive over a 12 to 18 month window.
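The exclusive-versus-wide decision comes down to simple arithmetic using the rates quoted above. This sketch makes two loud assumptions: your Audible net sales are the same under either model, and bounties are ignored. The dollar figures are invented for illustration, and `other_net` means the net receipts the wide distributor collects from non-Audible retailers.

```python
def exclusive_income(audible_net):
    """40 percent of Audible net sales under exclusive distribution."""
    return 0.40 * audible_net

def wide_income(audible_net, other_net):
    """25 percent of Audible net sales plus 80 percent of other net receipts."""
    return 0.25 * audible_net + 0.80 * other_net

def break_even_other_net(audible_net):
    """Other-platform net receipts needed for wide to match exclusive."""
    return (0.40 - 0.25) / 0.80 * audible_net

audible_net = 1000.0  # hypothetical
print(exclusive_income(audible_net))            # about 400
print(break_even_other_net(audible_net))        # about 187.5
print(wide_income(audible_net, 300.0))          # about 490
```

Put differently, under these assumptions wide distribution comes out ahead once non-Audible net receipts exceed roughly 19 percent of your Audible net sales.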
Author's Republic is another wide distributor that serves authors who want to expand reach without managing platform relationships individually.
Getting Your First 50 Reviews
Audiobook reviews on Audible are notoriously difficult to accumulate. Listeners do not review by default. The most effective strategies that authors use are:
- Sending Audible review codes to your existing email list or reader community. ACX provides authors with a set of promo codes on approval. Use them strategically.
- Reaching out to audiobook-specific review communities. Audiobook Boom, AudiobookReviewer.com, and r/audiobooks on Reddit are three active spaces where listeners seek new titles to review.
- Including a brief verbal call to action in your closing credits. Keep it under 20 seconds. Thank the listener and ask them to leave a review if the audiobook resonated with them.
- Reaching out to nonfiction bloggers, history podcasters, or genre-specific communities that align with your book's subject. A review from an established voice in your niche carries more weight than volume alone.
What Makes an Audiobook More Discoverable on Audible
Discovery on Audible is driven by a combination of review volume, review quality, completion rate (the percentage of listeners who finish the audiobook), and Whispersync compatibility for books that also have Kindle editions.
Emotional narration directly affects completion rate. Listeners who feel engaged with a narrator continue. Listeners who feel like they are hearing a machine read text to them stop. This is why the quality of narration matters not just for ACX approval but for long-term commercial performance.
Authors who invest in emotionally calibrated narration, whether through a skilled human narrator or through an AI voice capable of genuine tonal variation, consistently see better completion rates and, consequently, better organic ranking on the Audible platform.
A Quick Note on Voice Cloning for Audiobook Authors
Some authors want their audiobook narrated in their own voice without recording it themselves. Narration Box offers voice cloning that captures your vocal signature from a short sample and applies it to your full manuscript. This produces a narration that sounds like you without requiring you to sit in a recording booth for more than 40 hours.
For nonfiction authors especially, a self-narrated audiobook builds personal brand and listener trust in a way that a third-party narrator does not. The voice cloning route makes this practical at scale.
If you use voice cloning for ACX distribution, ensure that you are using your own voice sample and that the final audio meets all technical specifications. ACX does not differentiate between human and AI narration in its review process. The same standards apply.
Path Forward
If you have a manuscript and you want it on Audible, the path is:
- Clean and structure your manuscript.
- Decide on your narration approach.
- Use Narration Box to generate emotionally calibrated, ACX-compliant audio.
- Build your chapter file structure correctly from the start.
- Run a technical check before uploading.
- Test with a cold listener.
- Submit with confidence.
The authors who fail ACX review multiple times are not failing because their books are poor. They are failing because they did not know what the review process actually checks. Now you do.
Frequently Asked Questions
Do authors make money from audiobooks?
Yes, audiobooks represent a growing revenue stream for authors across genres. Audible alone accounts for a significant portion of the global audiobook market, and authors on ACX earn between 25 and 40 percent royalties depending on their distribution model. Nonfiction authors in business, history, self-help, and memoir often find that their audiobook earns comparably to their ebook over a 12-month period, particularly if they build a listener audience early.
Is selling 3,000 copies of a book good?
For an indie author, 3,000 copies sold is a meaningful milestone. It places you above the majority of self-published titles, many of which are reported to sell fewer than 100 copies in their first year. For an audiobook specifically, 3,000 completed listens on Audible can generate enough review momentum to improve organic discoverability. Whether that translates into strong revenue depends on your royalty model, your price point, and whether you have additional titles in the same genre or category.
What is the revenue of audiobooks?
The global audiobook market was valued at approximately 6.8 billion dollars in 2023 and is projected to grow at a compound annual rate of around 26 percent through 2030. The United States is the largest market by revenue. Audible dominates digital audiobook retail, but platforms like Spotify, Apple Books, and Scribd are expanding their audiobook catalogs and creating additional revenue channels for authors.
Is publishing audiobooks profitable?
It depends on your production costs and your distribution strategy. Producing a full-length audiobook with AI narration through Narration Box, at a fraction of the cost of studio time and a professional narrator, changes the economics significantly. With production costs lowered and royalties between 25 and 80 percent depending on the platform, the break-even threshold drops substantially. Authors with an existing readership, a strong nonfiction platform, or a series in a popular genre often report that audiobooks add meaningful revenue with relatively low ongoing cost after the initial production.
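A quick way to see the break-even effect is to divide production cost by what you earn per sale. The numbers below are illustrative assumptions: a 10-hour book, a human narrator at 300 dollars per finished hour (within the range quoted earlier), a hypothetical AI production cost of 300 dollars, and a hypothetical 10 dollars of net sales per copy.

```python
import math

def break_even_units(production_cost, net_per_sale, royalty_rate):
    """Whole number of sales needed to earn back the production cost."""
    return math.ceil(production_cost / (net_per_sale * royalty_rate))

human_cost = 10 * 300   # 10 finished hours at 300 dollars each
ai_cost = 300           # hypothetical figure, not a quoted price

print(break_even_units(human_cost, 10, 0.40))  # 750
print(break_even_units(human_cost, 10, 0.25))  # 1200
print(break_even_units(ai_cost, 10, 0.40))     # 75
```

The royalty rate matters, but under these assumptions the production cost moves the threshold far more.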
Try Narration Box for Your Audiobook
If you are ready to move from manuscript to finished, ACX-compliant audiobook without a recording booth or a 300-dollar-per-hour narrator, Narration Box is built for exactly that.
Upload your EPUB, PDF, or Word document. Choose your Enbee V2 narrator. Set your style prompt. Export your files in ACX-ready format.
