EchoLocate — Installation Guide
=================================

-----------------------------------------------------------------------
OPTION 1 — Use the live demo (nothing to install)
-----------------------------------------------------------------------

Visit:  https://mgifford.github.io/EchoLocate/

Works in Chrome or Edge on desktop.  All audio stays in your browser;
nothing is sent to any server.


-----------------------------------------------------------------------
OPTION 2 — Run locally (for offline use or extra privacy)
-----------------------------------------------------------------------

Requirements
  • Git
  • Python 3.7 or later  (pre-installed on most Linux distros; on macOS
    it is installed with the Xcode Command Line Tools)
  • Chrome or Microsoft Edge  (needed for the Web Speech API)
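
The command-line requirements above can be checked before cloning.
The following is a minimal sketch, not part of EchoLocate: the
function name check_requirements is hypothetical, and it only checks
the Python version and whether a git executable is on PATH.  It does
not check the browser.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the requirements list


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Compare only (major, minor) against the documented minimum.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d or later is required" % MIN_PYTHON)
    # shutil.which returns None when the executable is not on PATH.
    if which("git") is None:
        problems.append("git was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Python and Git look fine.")
```

The two parameters exist only so the check can be exercised with fake
values; calling check_requirements() with no arguments inspects the
running interpreter and the real PATH.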

Steps

  1. Clone the repository

        git clone https://github.com/mgifford/EchoLocate.git
        cd EchoLocate

     The vendor/ directory is committed to the repository, so all
     JavaScript dependencies (HTMX, Meyda, franc-min) are already
     present — no npm install or build step needed.

  2. Start the local server

        python3 server.py

     You should see:

        EchoLocate is running at:
          http://localhost:8080/

  3. Open Chrome or Edge and navigate to

        http://localhost:8080/

  4. Allow microphone access when prompted by the browser.

  5. Click Start to begin live captioning.

The app works fully offline once loaded; no internet connection is
needed for any captioning or analysis.  Close the terminal window
(or press Ctrl+C) to stop the server.

Custom port:

    python3 server.py 9000       # changes port to 9000
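
For readers curious about what a local server of this kind involves,
here is a minimal sketch using only the Python standard library.  It
is an illustration under assumptions, not the contents of server.py:
the names parse_port, make_server and main are hypothetical, and the
real script may do more (for example, setting extra headers).

```python
import sys
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

DEFAULT_PORT = 8080  # the port shown in the steps above


def parse_port(argv):
    """Return the port from argv[1] if present, otherwise DEFAULT_PORT."""
    if len(argv) < 2:
        return DEFAULT_PORT
    port = int(argv[1])
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return port


def make_server(port, directory="."):
    """Build a server bound to localhost only that serves `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def main(argv=None):
    """Serve the current directory until interrupted with Ctrl+C."""
    argv = sys.argv if argv is None else argv
    port = parse_port(argv)
    server = make_server(port)
    print("Serving at http://localhost:%d/ (Ctrl+C to stop)" % port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling main() from a script in the EchoLocate directory would serve
that directory.  Binding to 127.0.0.1 rather than all interfaces keeps
the page unreachable from other machines on the network, which fits
the privacy goal of running locally.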


-----------------------------------------------------------------------
OPTION 3 — On-device language-detection model (optional, ~14 MB)
-----------------------------------------------------------------------

The base app uses n-gram heuristics (franc-min) for language detection.
For higher accuracy, you can download a neural language-id model that
runs entirely in the browser via WebAssembly.

  1. Ensure you are connected to the internet for the download step.

  2. From the EchoLocate directory, run:

        chmod +x download-models.sh
        ./download-models.sh

     This downloads:
       • Transformers.js runtime        vendor/transformers/    ~800 KB
       • ONNX Runtime WASM files        vendor/onnx-runtime/   ~10 MB
       • language-id model              models/language-id/     ~3 MB

     Total: approximately 14 MB.

  3. Restart or reload the server:

        python3 server.py

  4. EchoLocate will detect the model files at startup and use them
     automatically.  No configuration is required.

The model identifies 97 languages directly from recognised text in
roughly 40 ms per card.  After this one-time download the app runs
with no internet connection required.
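
To confirm that the download step populated the three directories
listed above, a small check like the following may help.  It is a
sketch under assumptions: only the directory names come from this
guide, the function names are hypothetical, and it reports sizes
rather than validating the files themselves.

```python
from pathlib import Path

# Directories named in the download list above.
EXPECTED_DIRS = (
    "vendor/transformers",
    "vendor/onnx-runtime",
    "models/language-id",
)


def directory_size(path):
    """Total size in bytes of all regular files under `path`."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def report_model_files(root="."):
    """Map each expected directory to its size in bytes, or None if absent."""
    report = {}
    for rel in EXPECTED_DIRS:
        target = Path(root) / rel
        report[rel] = directory_size(target) if target.is_dir() else None
    return report


if __name__ == "__main__":
    for rel, size in report_model_files().items():
        if size is None:
            print("%-22s missing" % rel)
        else:
            print("%-22s %.1f MB" % (rel, size / 1e6))
```

Run from the EchoLocate directory, the sizes should come out close to
the figures quoted above (about 0.8 + 10 + 3, so roughly 14 MB in
total).  A directory reported as missing suggests the download did not
complete.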


-----------------------------------------------------------------------
UPDATING DEPENDENCIES
-----------------------------------------------------------------------

To re-download all vendored JavaScript at its pinned versions:

    chmod +x download-deps.sh
    ./download-deps.sh

This re-downloads HTMX, Meyda, and franc-min from their CDN sources
into vendor/.  The pinned version numbers are at the top of that
script; edit them to upgrade.


-----------------------------------------------------------------------
TROUBLESHOOTING
-----------------------------------------------------------------------

"Microphone access denied"
  Grant microphone permission via the lock icon in the browser's
  address bar.  On macOS, also check System Settings → Privacy &
  Security → Microphone.

"Speech recognition not available"
  Use Chrome or Microsoft Edge.  Firefox and Safari do not currently
  implement the Web Speech API.

"Service worker errors in DevTools"
  In DevTools, open Application → Storage, click "Clear site data",
  then hard-reload (Cmd+Shift+R / Ctrl+Shift+R).

"Port 8080 already in use"
  Pass a different port: python3 server.py 9090
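
If it is unclear which port to pick instead, the following sketch
looks for one that is currently free.  It is not part of EchoLocate;
port_is_free and first_free_port are hypothetical helper names, and a
port reported as free could still be taken by another program before
the server starts.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use".
            return False
    return True


def first_free_port(start=8080, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    if port is None:
        print("No free port found in the range tried.")
    else:
        print("Try: python3 server.py %d" % port)
```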


-----------------------------------------------------------------------
PRIVACY
-----------------------------------------------------------------------

EchoLocate processes all audio on your device.  No transcript text,
audio, speaker data, or session content is ever transmitted to any
server.  See index.html for the full privacy notice displayed in the app.


-----------------------------------------------------------------------
LICENSE
-----------------------------------------------------------------------

MIT — see LICENSE for details.