Modern Package Development: How I Built & Shipped a DX-First Library
I was building an AI chatbot called Blockpedia and wanted to add a speech-to-text feature. You know, like how ChatGPT, Gemini, and Claude have that little mic icon that lets you talk instead of type.
I didn't want to pay for Whisper API calls or deal with complex audio processing setups. Browsers already have the Web Speech API built in: free, instant, and good enough for most use cases.
So I searched for React packages. Found a few. Used them. And quickly realized they all had the same problem: they just wrap the API without solving the actual hard parts.
The "Hard Parts" Nobody Handles
- ✗ Mic permissions - Users click the button, nothing happens, they're confused. Where's the permission prompt? Was it denied before?
- ✗ Silence detection - User stops talking but recording continues forever. Or it cuts off mid-sentence.
- ✗ Cursor position - Text always appends to the end. What if I want to insert it where my cursor is?
- ✗ Browser quirks - Works in Chrome, breaks in Safari. No webkit prefixing.
So I did what any developer does when they can't find the right tool: I built it myself and published it on npm.
This post is the story of how I built @syntropy-labs/react-web-speech, but more importantly, it's a guide to modern npm package development in 2026 - the tools, the patterns, and the mindset.
What I Built
Before diving into how to build packages, let me show you what the end result looks like. The entire API is two hooks:
import { useSpeechInput } from '@syntropy-labs/react-web-speech'

function VoiceInput() {
  const { transcript, isListening, toggle, permissionState } = useSpeechInput({
    silenceTimeout: 2000, // Auto-stop after 2s of silence
    continuous: false, // Stop after one phrase
  })
  return (
    <div>
      <button onClick={toggle}>
        {isListening ? '🔴 Listening...' : '🎤 Click to speak'}
      </button>
      <p>Permission: {permissionState}</p> {/* 'prompt' | 'granted' | 'denied' */}
      <p>You said: {transcript}</p>
    </div>
  )
}
And for the cursor-aware insertion that existing packages don't handle:
import { useRef, useState } from 'react'
import { useSpeechInputWithCursor } from '@syntropy-labs/react-web-speech'

function SmartTextarea() {
  const inputRef = useRef<HTMLTextAreaElement>(null)
  const [value, setValue] = useState('')
  const { isListening, toggle } = useSpeechInputWithCursor({
    inputRef,
    value,
    onChange: setValue, // Text inserts at cursor position
  })
  return (
    <div>
      <textarea ref={inputRef} value={value} onChange={e => setValue(e.target.value)} />
      <button onClick={toggle}>{isListening ? 'Stop' : 'Speak'}</button>
    </div>
  )
}
Know exactly if permission is prompt/granted/denied before the user clicks anything
Configurable timeout that actually works - stops cleanly when user stops talking
Text goes exactly where the cursor is, not just appended to the end
Handles webkit prefixing for Safari, graceful degradation for Firefox
~5KB gzipped. Fully typed. Tree-shakeable. Zero dependencies beyond React.
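One way to picture the configurable silence timeout is a countdown that restarts every time the recognizer reports speech and fires once things have been quiet for the configured interval. The sketch below illustrates that idea only; it is not the package's actual implementation, and `createSilenceTimer`, `bump`, `cancel`, and the injectable `scheduler` are hypothetical names introduced here.

```typescript
// Hypothetical helper, not part of @syntropy-labs/react-web-speech.
interface Scheduler {
  set(callback: () => void, ms: number): unknown
  clear(handle: unknown): void
}

// Default scheduler backed by the global timer functions.
const timeoutScheduler: Scheduler = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
}

function createSilenceTimer(
  timeoutMs: number,
  onSilence: () => void,
  scheduler: Scheduler = timeoutScheduler,
) {
  let handle: unknown = null

  // Cancel the pending countdown, if there is one.
  const cancel = (): void => {
    if (handle !== null) {
      scheduler.clear(handle)
      handle = null
    }
  }

  // Call on every speech result: restarts the countdown from zero.
  const bump = (): void => {
    cancel()
    handle = scheduler.set(() => {
      handle = null
      onSilence()
    }, timeoutMs)
  }

  return { bump, cancel }
}
```

In a hook, `bump()` would be called from the recognizer's result handler and `cancel()` on stop or unmount. Injecting the scheduler is a design choice made here so the logic can be tested without waiting on real timers.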
Try It Yourself
Don't take my word for it - here's the actual package running on this page. Click the mic and say something.
Demo 1: Basic Speech Input
The useSpeechInput hook handles everything - permission state, listening state, transcript, and automatic silence detection.
Demo 2: Cursor-Aware Text Insertion
This is what other packages don't do. Position your cursor anywhere in the text below, then speak - your words will be inserted exactly where the cursor is.
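The core of cursor-aware insertion is a small string operation: splice the recognized text into the current value at the selection and report where the caret should land afterwards. The function below is a minimal sketch of that step, assuming the selection offsets come from the input's `selectionStart` and `selectionEnd`; `insertAtCursor` and `InsertResult` are illustrative names, not the package's internal utilities.

```typescript
// Hypothetical helper, not the package's internal cursor utility.
interface InsertResult {
  value: string // The new field value
  caret: number // Where the caret should sit after the insertion
}

function insertAtCursor(
  value: string,
  text: string,
  selectionStart: number,
  selectionEnd: number = selectionStart,
): InsertResult {
  // Clamp the offsets so stale or out-of-range selections cannot corrupt the value.
  const start = Math.max(0, Math.min(selectionStart, value.length))
  const end = Math.max(start, Math.min(selectionEnd, value.length))

  // Any selected text between start and end is replaced by the new text.
  const next = value.slice(0, start) + text + value.slice(end)
  return { value: next, caret: start + text.length }
}

const inserted = insertAtCursor('Hello world', 'big ', 6)
// inserted.value === 'Hello big world', inserted.caret === 10
```

In a controlled React input, the new value would go through `onChange`, and the caret would typically be restored afterwards with `setSelectionRange(caret, caret)` once the DOM has updated.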
What's happening under the hood?
- → Permission tracking - The hook checks mic permission state before you even click, so you can show appropriate UI
- → Silence detection - Recording stops automatically 2 seconds after you stop talking (configurable)
- → Cursor position - The hook captures and restores the selection range, inserting text at the exact cursor position
- → Browser compatibility - Handles webkit prefixing for Safari automatically
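Permission tracking of the kind described above can be sketched with the Permissions API, which reports a state for a named permission without triggering a prompt. This is an assumption about the approach, not the package's confirmed implementation: `getMicPermission`, `normalizePermissionState`, and `PermissionsLike` are hypothetical names, and falling back to 'prompt' when the query is unavailable is a choice made for this example.

```typescript
// Hypothetical sketch, not the package's actual code.
type MicPermissionState = 'prompt' | 'granted' | 'denied'

// Minimal structural type so the sketch does not depend on DOM typings.
interface PermissionsLike {
  query(descriptor: { name: string }): Promise<{ state: string }>
}

// Map whatever the browser reports onto the three states the UI cares about.
function normalizePermissionState(state: string | undefined): MicPermissionState {
  return state === 'granted' || state === 'denied' ? state : 'prompt'
}

async function getMicPermission(
  permissions?: PermissionsLike,
): Promise<MicPermissionState> {
  // No Permissions API at all: assume the user has not been asked yet.
  if (!permissions) return 'prompt'
  try {
    const status = await permissions.query({ name: 'microphone' })
    return normalizePermissionState(status.state)
  } catch {
    // Some browsers reject the 'microphone' name; treat that as unknown.
    return 'prompt'
  }
}
```

In the browser, the argument would be `navigator.permissions` when it exists. Passing it in rather than reading the global keeps the function usable during server-side rendering and easy to test.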
The Modern Package Toolchain (2026)
The JavaScript ecosystem moves fast. Here's what I used and why - these are the tools that will save you hours of configuration hell.
tsdown
Build Tool
tsdown is the spiritual successor to tsup, powered by Rolldown (the Rust bundler). It's what you should be using in 2026.
- 10x faster builds (Rust-based)
- Native ESM + CJS dual output
- Auto-generates .d.ts files
- Zero config for most cases
Coming from tsup:
- Same API surface
- Drop-in replacement
- Same config file format
- Just change the package name
import { defineConfig } from 'tsdown'

export default defineConfig({
  entry: ['./src/index.ts'],
  format: ['esm', 'cjs'], // Dual package support
  dts: true, // Generate TypeScript declarations
  clean: true, // Clean dist/ before build
  treeshake: true, // Remove unused code
  external: ['react', 'react-dom'], // Don't bundle peer deps
})
Vitest + Testing Library
Testing
Jest is fine, but Vitest is faster and has native TypeScript support. For React hooks, pair it with @testing-library/react.
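import { renderHook, act } from '@testing-library/react'
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'
import { useSpeechInput } from '../hooks/useSpeechInput'

// jsdom does not implement the Web Speech API, so isSupported would be false
// without a stub. This fake is a minimal sketch; extend it with whatever
// methods and events your hook actually uses.
class FakeSpeechRecognition {
  start = vi.fn()
  stop = vi.fn()
  abort = vi.fn()
  addEventListener = vi.fn()
  removeEventListener = vi.fn()
}

beforeEach(() => vi.stubGlobal('SpeechRecognition', FakeSpeechRecognition))
afterEach(() => vi.unstubAllGlobals())

describe('useSpeechInput', () => {
  it('should initialize with correct defaults', () => {
    const { result } = renderHook(() => useSpeechInput())
    expect(result.current.isListening).toBe(false)
    expect(result.current.transcript).toBe('')
    expect(result.current.isSupported).toBe(true)
  })
  it('should toggle listening state', () => {
    const { result } = renderHook(() => useSpeechInput())
    act(() => result.current.toggle())
    expect(result.current.isListening).toBe(true)
  })
})
Changesets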
Versioning
Forget manually bumping versions. Changesets handles semver, changelogs, and npm publishing in one workflow.
# Add a changeset for your changes
npx changeset
# When ready to release, version all packages
npx changeset version
# Publish to npm
npx changeset publish
It asks you what kind of change (patch/minor/major) and generates a changelog entry. No more "what version should this be?" debates.
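For readers less familiar with semver, the three change types map to version numbers as follows. The function below is a simplified sketch of that arithmetic, not Changesets' own code; `bumpVersion` is a hypothetical name, and prerelease tags such as `1.0.0-beta.1` are deliberately not handled.

```typescript
// Simplified semver bump for plain MAJOR.MINOR.PATCH versions only.
type BumpType = 'patch' | 'minor' | 'major'

function bumpVersion(version: string, bump: BumpType): string {
  const parts = version.split('.').map(Number)
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`)
  }
  const [major, minor, patch] = parts
  switch (bump) {
    case 'major':
      return `${major + 1}.0.0` // Breaking change: reset minor and patch
    case 'minor':
      return `${major}.${minor + 1}.0` // New feature: reset patch
    case 'patch':
      return `${major}.${minor}.${patch + 1}` // Bug fix
  }
}

bumpVersion('0.1.2', 'patch') // '0.1.3'
bumpVersion('0.1.2', 'minor') // '0.2.0'
bumpVersion('0.1.2', 'major') // '1.0.0'
```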
Husky + lint-staged
Git Hooks
Automatically lint and format staged files before every commit, so lint errors and formatting drift are caught before they reach the repository.
{
  "lint-staged": {
    "*.{js,ts,tsx}": ["eslint --fix", "prettier --write"]
  },
  "scripts": {
    "prepare": "husky"
  }
}
# .husky/pre-commit
npx lint-staged
2020 vs 2026 Tooling
| Purpose | 2020 | 2026 |
|---|---|---|
| Bundler | Rollup / Webpack | tsdown (Rust) |
| Testing | Jest | Vitest |
| Type Generation | tsc (slow) | tsdown dts (fast) |
| Versioning | Manual + npm version | Changesets |
| Linting | ESLint (slow) | ESLint + Oxlint (fast) |
Project Structure
Here's the structure I used. It's pretty standard, but the key is keeping it simple:
react-web-speech/
├── src/
│   ├── index.ts                  # Public exports only
│   ├── types.ts                  # Shared option and result types
│   ├── hooks/
│   │   ├── useSpeechInput.ts
│   │   └── useSpeechInputWithCursor.ts
│   ├── utils/
│   │   └── cursor.ts             # Cursor manipulation utilities
│   └── __tests__/
│       └── useSpeechInput.test.ts
├── dist/                         # Generated by tsdown
│   ├── index.mjs                 # ESM build
│   ├── index.cjs                 # CommonJS build
│   ├── index.d.mts               # Type declarations (ESM)
│   └── index.d.cts               # Type declarations (CJS)
├── tsdown.config.ts
├── tsconfig.json
├── package.json
└── README.md
Key Principle: Explicit Public API
Your src/index.ts should only export what users need. Don't export internal utilities or types unless they're actually useful to consumers.
// Only export what users need
export { useSpeechInput } from './hooks/useSpeechInput'
export { useSpeechInputWithCursor } from './hooks/useSpeechInputWithCursor'

// Export types for TypeScript users
export type {
  SpeechInputOptions,
  SpeechInputResult,
  PermissionState
} from './types'

// Internal utilities stay internal
// DON'T export: insertTextAtCursor, getSelectionRange, etc.
The package.json Anatomy
This is where most people mess up. Here's every field that matters for a modern npm package:
{
  "name": "@syntropy-labs/react-web-speech",
  "version": "0.1.2",
  "description": "React hooks for Web Speech API with mic permissionStates, listening states, and cursor-aware text insertion",
  "type": "module",
  "main": "./dist/index.cjs",
  "module": "./dist/index.mjs",
  "types": "./dist/index.d.cts",
  "exports": {
    ".": {
      "import": {
        "types": "./dist/index.d.mts",
        "default": "./dist/index.mjs"
      },
      "require": {
        "types": "./dist/index.d.cts",
        "default": "./dist/index.cjs"
      }
    }
  },
  "files": ["dist"],
  "sideEffects": false,
  "peerDependencies": {
    "react": ">=17.0.0",
    "react-dom": ">=17.0.0"
  },
  "peerDependenciesMeta": {
    "react-dom": { "optional": true }
  },
  "scripts": {
    "build": "tsdown",
    "dev": "tsdown --watch",
    "test": "vitest",
    "lint": "eslint src/",
    "lint:fix": "eslint src/ --fix",
    "release": "changeset publish"
  },
  "keywords": ["react", "speech", "voice", "web-speech-api", "hooks"],
  "license": "MIT",
  "repository": "https://github.com/syntropyLabs/react-web-speech"
}
"type": "module"
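Tells Node.js to treat .js files in this package as ES modules, which is what most modern tooling expects. Node.js has traditionally treated .js files as CommonJS when this field is missing, and that is where import/export syntax errors come from.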
"exports" (Conditional exports)
The modern way to define entry points. Supports both ESM (import) and CJS (require) with proper type declarations for each. This is what makes your package work everywhere.
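Conditional exports are matched in the order the keys are written, and "default" always matches, which is why "types" sits before "default" in each branch. The snippet below is a simplified model of that lookup, written for illustration only; it is not Node's or TypeScript's actual resolver, and `resolveTarget` and `ExportTarget` are hypothetical names.

```typescript
// Simplified model of resolving one conditional "exports" entry.
type ExportTarget = string | { [condition: string]: ExportTarget }

function resolveTarget(
  target: ExportTarget,
  conditions: string[],
): string | undefined {
  if (typeof target === 'string') return target
  // Keys are tried in the order they are written; the first match wins.
  for (const key of Object.keys(target)) {
    if (key === 'default' || conditions.indexOf(key) !== -1) {
      const resolved = resolveTarget(target[key], conditions)
      if (resolved !== undefined) return resolved
    }
  }
  return undefined
}

// The "." entry from the package.json above.
const entry: ExportTarget = {
  import: { types: './dist/index.d.mts', default: './dist/index.mjs' },
  require: { types: './dist/index.d.cts', default: './dist/index.cjs' },
}

resolveTarget(entry, ['node', 'import']) // './dist/index.mjs'
resolveTarget(entry, ['node', 'require']) // './dist/index.cjs'
resolveTarget(entry, ['types', 'import']) // './dist/index.d.mts'
```

If "default" were listed before "types", a type-aware resolver following this order would stop at the JavaScript file and never reach the declarations.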
"sideEffects": false
Tells bundlers your package has no side effects, enabling aggressive tree-shaking. If users only import one hook, they shouldn't get the other in their bundle.
"peerDependencies"
React packages should NEVER bundle React. Instead, list it as a peer dependency so users' existing React version is used. This prevents the dreaded "multiple React instances" error.
"files": ["dist"]
Only publish the dist folder to npm. Don't publish src/, tests, or config files. Keeps your package size small.
The DX-First Mindset
DX (Developer Experience) isn't just about having good docs. It's about designing your API so developers fall into the pit of success. Here's how I thought about it:
1. Sensible Defaults
The hook should work with zero configuration. Only require options when there's no obvious default.
// Avoid: options the caller has to spell out every time
useSpeech({ lang: 'en-US', continuous: false, ... })
// Prefer: works with no arguments
useSpeechInput() // Just works
2. Discoverable API
Return objects, not arrays. Let TypeScript autocomplete guide developers.
// Avoid: positional values the caller has to remember
const [text, listening, toggle] = useSpeech()
// Prefer: named fields that autocomplete
const { transcript, isListening } = useSpeechInput()
3. Handle Edge Cases Internally
Don't force users to handle browser quirks. Do it inside your package.
// Inside the hook - users never see this
const SpeechRecognition =
  typeof window !== 'undefined'
    ? window.SpeechRecognition || window.webkitSpeechRecognition // Prefixed in Chrome and Safari
    : undefined // No window during server-side rendering
const isSupported = typeof SpeechRecognition !== 'undefined'
// Expose a simple boolean to users
return { isSupported, ... }
4. Fail Gracefully
If the browser doesn't support the feature, don't crash. Return a sensible state.
// If Web Speech API isn't supported
if (!isSupported) {
  return {
    isListening: false,
    transcript: '',
    isSupported: false,
    permissionState: 'denied' as const,
    start: () => {}, // No-op
    stop: () => {}, // No-op
    toggle: () => {}, // No-op
  }
}
Publishing to npm
Once your package is ready, publishing is straightforward:
# Login to npm (create account at npmjs.com if needed)
npm login
# For scoped packages (@your-name/package)
npm login --scope=@your-scope
# Initialize changesets (one time)
npx changeset init
# 1. Make your changes and commit them
# 2. Create a changeset describing the change
npx changeset
# > What kind of change? (patch/minor/major)
# > Summary: "Add silence detection timeout option"
# 3. When ready to release, version the package
npx changeset version
# This bumps version in package.json and updates CHANGELOG.md
# 4. Build and publish
npm run build
npx changeset publish
# or just: npm run release (if you have it in scripts)
Pro Tip: Provenance
Add provenance to your publishes for supply chain security. This proves your package was built from your GitHub repo:
{
  "publishConfig": {
    "provenance": true
  }
}
Requires publishing from a supported CI environment such as GitHub Actions, where the workflow needs the id-token: write permission. npm will show a "Provenance" badge on your package page.
Key Takeaways
1. Build what you actually need. The best packages come from solving your own problems. If you can't find a good solution, others probably can't either.
2. Use modern tooling. tsdown, Vitest, and Changesets will save you hours. The Rust-based tools are genuinely faster.
3. Obsess over DX. Sensible defaults, discoverable APIs, graceful degradation. Make it impossible for users to use your package wrong.
4. Handle the hard parts. Browser quirks, edge cases, permissions - don't just wrap an API, solve the actual problems developers face.
5. Ship it. Your first version doesn't need to be perfect. v0.1.0 is better than an unpublished masterpiece.
That's how I built and shipped my first npm package. If you're building something and can't find the right tool - maybe it's time to build it yourself.
Resources:
• tsdown - github.com/rolldown/tsdown
• Changesets - github.com/changesets/changesets
• Vitest - vitest.dev
• Web Speech API MDN - developer.mozilla.org


