GSoC Weekly Update: Week 3


This post will briefly cover:

  • Learnings
  • Tasks done, and those in progress
  • Helpful resources

For the project proposal, visit here.


This post continues from the previous one, which has since been updated as well.


Gathering the Vosk API from GitHub:

I used devtool to fetch the Vosk offline speech recognition API from GitHub:

$ devtool add python3-vosk --src-subdir python --srcrev b1b216d4c87d708935f1601287fe502aa11ee4a9 --version 0.3.42 --srcbranch master https://github.com/alphacep/vosk-api

# Inside the workdir as mentioned in the output of devtool add, found the following recipe:
========================================================================================
# Recipe created by recipetool
# This is the basis of a recipe and may need further editing in order to be fully functional.
# (Feel free to remove these comments when editing.)

SUMMARY = "Offline open source speech recognition API based on Kaldi and Vosk"
HOMEPAGE = "https://github.com/alphacep/vosk-api"
# WARNING: the following LICENSE and LIC_FILES_CHKSUM values are best guesses - it is
# your responsibility to verify that the values are complete and correct.
#
# The following license files were not able to be identified and are
# represented as "Unknown" below, you will need to check them yourself:
#   .eggs/tqdm-4.64.0-py3.8.egg/EGG-INFO/LICENCE
# NOTE: Original package / source metadata indicates license is: Apache
#
# NOTE: multiple licenses have been detected; they have been separated with &
# in the LICENSE value for now since it is a reasonable assumption that all
# of the licenses apply. If instead there is a choice between the multiple
# licenses then you should change the value to separate the licenses with |
# instead of &. If there is any doubt, check the accompanying documentation
# to determine which situation is applicable.
LICENSE = "MIT & Unknown & Apache"
LIC_FILES_CHKSUM = "file://.eggs/srt-3.5.2-py3.8.egg/EGG-INFO/LICENSE;md5=6658a1272b4469f7249985d28b8697bb \
                    file://.eggs/tqdm-4.64.0-py3.8.egg/EGG-INFO/LICENCE;md5=1672e2674934fd93a31c09cf17f34100"

SRC_URI = "git://github.com/alphacep/vosk-api;protocol=https;branch=master"

# Modify these as desired
PV = "0.3.42+git${SRCPV}"
SRCREV = "b1b216d4c87d708935f1601287fe502aa11ee4a9"

S = "${WORKDIR}/git/python"

inherit setuptools3

# WARNING: the following rdepends are determined through basic analysis of the
# python sources, and might not be 100% accurate.
RDEPENDS_${PN} += "python3-cffi python3-compression python3-core python3-datetime python3-json python3-logging python3-misc python3-multiprocessing python3-netclient python3-requests python3-tqdm python3-srt"
========================================================================================

Special thanks to Tim Orling (moto-timo) for his help and for pointing out the errors in my previous approach to writing and building the Vosk recipe: the recipe should be built from the actual GitHub repo rather than from the wheel package on PyPI.

Gathering the ‘srt’ library tarball

Again, I used devtool to obtain the srt library, which is a build-time dependency of the Vosk API.

$ devtool add python3-srt https://files.pythonhosted.org/packages/18/a3/e1466f7c86a9e5d3e462ed6eb3a548917e93cc1ee212cd927f8f4e887ae9/srt-3.5.2.tar.gz

# Inside the workdir as mentioned in the output of devtool add, found the following recipe:
=======================================================================================

# Recipe created by recipetool
# This is the basis of a recipe and may need further editing in order to be fully functional.
# (Feel free to remove these comments when editing.)

SUMMARY = "A tiny library for parsing, modifying, and composing SRT files."
HOMEPAGE = "https://github.com/cdown/srt"
# WARNING: the following LICENSE and LIC_FILES_CHKSUM values are best guesses - it is
# your responsibility to verify that the values are complete and correct.
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://LICENSE;md5=6658a1272b4469f7249985d28b8697bb"

SRC_URI = "https://files.pythonhosted.org/packages/18/a3/e1466f7c86a9e5d3e462ed6eb3a548917e93cc1ee212cd927f8f4e887ae9/srt-${PV}.tar.gz"
SRC_URI[md5sum] = "3b68be7c46ec6152123fd801f519a63d"
SRC_URI[sha1sum] = "902e36e37f02e62488439bf86066dfa3c4b2b672"
SRC_URI[sha256sum] = "7aa4ad5ce4126d3f53b3e7bc4edaa86653d0378bf1c0b1ab8c59f5ab41384450"
SRC_URI[sha384sum] = "c68a0de85c3ad5a8026c15c1b750281479c4f264fa6f9767f93f2001f414cac52e2ca502f8ae13d6d885101fa95c32ac"
SRC_URI[sha512sum] = "5367d7fa3ed23523f03efad1524fcb44c1a8e1c95e2f3032c0e11ff67795a1399eb32b27365e4b0f98ed5b1d7671d576ab8cd342d50bb4005554faaf03ea9c8a"

S = "${WORKDIR}/srt-${PV}"

inherit setuptools3

# WARNING: the following rdepends are determined through basic analysis of the
# python sources, and might not be 100% accurate.
RDEPENDS_${PN} += "python3-core python3-datetime python3-logging"
=======================================================================================
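
As a side note, the SRC_URI[sha256sum] value generated by recipetool can be cross-checked against a manually downloaded copy of the tarball. The snippet below is only a minimal sketch: verify_sha256 is a hypothetical helper name (it is not part of devtool or BitBake), and it assumes the coreutils sha256sum command is available on the build host.

```shell
# Hypothetical helper (not part of devtool or BitBake): succeed only if the
# file at $1 has the SHA-256 digest given in $2.
verify_sha256() {
    # sha256sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Example, assuming srt-3.5.2.tar.gz was downloaded into the current directory:
# verify_sha256 srt-3.5.2.tar.gz \
#     7aa4ad5ce4126d3f53b3e7bc4edaa86653d0378bf1c0b1ab8c59f5ab41384450 \
#     && echo "checksum matches the recipe"
```

BitBake performs the same comparison itself during do_fetch, so this is only useful for checking a tarball by hand before writing the value into a recipe.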

A few tweaks were necessary for both recipes to build. The Vosk API recipe needs a few build-time dependencies: python3-srt-native, python3-tqdm-native, and python3-requests-native.


The python3-tqdm-native and python3-requests-native recipes are already available through the OpenEmbedded Layer Index, but python3-srt-native is not. Hence, I added the BBCLASSEXTEND variable to the python3-srt recipe so that it also provides the native variant needed to build the Vosk API recipe.


The final python3-srt_3.5.2.bb recipe looks like this:

# python3-srt

SUMMARY = "A tiny library for parsing, modifying, and composing SRT files."
HOMEPAGE = "https://github.com/cdown/srt"
AUTHOR = "aman.arora9848@gmail.com"

LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://LICENSE;md5=6658a1272b4469f7249985d28b8697bb"

SRC_URI = "https://files.pythonhosted.org/packages/18/a3/e1466f7c86a9e5d3e462ed6eb3a548917e93cc1ee212cd927f8f4e887ae9/srt-${PV}.tar.gz"
SRC_URI[md5sum] = "3b68be7c46ec6152123fd801f519a63d"
SRC_URI[sha256sum] = "7aa4ad5ce4126d3f53b3e7bc4edaa86653d0378bf1c0b1ab8c59f5ab41384450"

S = "${WORKDIR}/srt-${PV}"

inherit setuptools3

RDEPENDS_${PN} += "python3-core python3-datetime python3-logging"

BBCLASSEXTEND = "native nativesdk"
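
The BBCLASSEXTEND line is what allows another recipe to list python3-srt-native in its DEPENDS. A quick way to check whether a recipe file declares a native variant is sketched below. Note that extends_native is a hypothetical helper, not a BitBake command, and it only greps the text of a single recipe file, so it will miss values set through include files or bbappends.

```shell
# Hypothetical helper (not a BitBake command): succeed if the recipe file at $1
# has a BBCLASSEXTEND assignment that contains the word "native".
extends_native() {
    # First grep selects BBCLASSEXTEND assignments (=, +=, ?=, :=, .=).
    # Second grep uses -w so that "nativesdk" alone does not count as "native".
    grep -E '^BBCLASSEXTEND[[:space:]]*[+:?.]?=' "$1" | grep -qw native
}

# Example, run from the root of the layer:
# extends_native recipes-vosk/python3-srt/python3-srt_3.5.2.bb \
#     && echo "python3-srt-native can be used in DEPENDS"
```

The definitive check is still to build the variant itself with bitbake python3-srt-native.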

Finalizing

After adding all the dependencies reported in the build errors for python3-vosk_0.3.42.bb, the final recipe looks like this:

# python3-vosk

SUMMARY = "Offline open source speech recognition API based on Kaldi and Vosk"
HOMEPAGE = "https://github.com/alphacep/vosk-api"
AUTHOR = "aman.arora9848@gmail.com"

LICENSE = "Apache-2.0"
LIC_FILES_CHKSUM = "file://../COPYING;md5=d09bbd7a3746b6052fbd78b26a87396b"

SRC_URI = "git://github.com/alphacep/vosk-api;protocol=https;branch=master"

PV = "0.3.42+git${SRCPV}"
SRCREV = "b1b216d4c87d708935f1601287fe502aa11ee4a9"

S = "${WORKDIR}/git/python"

inherit setuptools3

DEPENDS += " \
    python3-srt-native \
    python3-tqdm-native \
    python3-requests-native \
    "

RDEPENDS_${PN} += " \
    python3-cffi \
    python3-compression \
    python3-core \
    python3-datetime \
    python3-json \
    python3-logging \
    python3-misc \
    python3-multiprocessing \
    python3-netclient \
    python3-requests \
    python3-tqdm \
    python3-srt \
    "
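
The LIC_FILES_CHKSUM entry in this recipe uses ../COPYING because paths there are resolved relative to S, which is the python subdirectory, while the license file sits one level above it. The md5 value for such an entry can be computed by hand. The snippet below is a minimal sketch, assuming the coreutils md5sum command is available; lic_chksum_entry is a hypothetical helper name, not something provided by devtool or BitBake.

```shell
# Hypothetical helper (not part of devtool or BitBake): print a
# LIC_FILES_CHKSUM entry for a license file.
#   $1: path to the license file on disk
#   $2: path as it should appear in the recipe, relative to ${S}
lic_chksum_entry() {
    # md5sum prints "<digest>  <file>"; keep only the digest.
    digest=$(md5sum "$1" | cut -d' ' -f1)
    printf 'file://%s;md5=%s\n' "$2" "$digest"
}

# Example, run from the root of a vosk-api checkout:
# lic_chksum_entry COPYING ../COPYING
```

If the printed digest differs from the one in the recipe, the license text has changed upstream and should be reviewed again before updating the value.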

The recipe builds with no errors:

aman@Debian-1011-buster-64-minimal:~/AGL/marlin/qemux86-64$ bitbake python3-vosk
Loading cache: 100% |###############################################################################################################################################################################| Time: 0:00:00
Loaded 4027 entries from dependency cache.                                                               
Parsing recipes: 100% |#############################################################################################################################################################################| Time: 0:00:03
Parsing of 2661 .bb files complete (2641 cached, 20 parsed). 4046 targets, 269 skipped, 2 masked, 0 errors.                                                                                                        
NOTE: Resolving any missing task queue dependencies                                               
                                                                                                                                                                                                                   
Build Configuration:                                                                                                                                                                                              
BB_VERSION           = "1.46.0"                    
BUILD_SYS            = "x86_64-linux"
NATIVELSBSTRING      = "universal"
TARGET_SYS           = "x86_64-agl-linux"
MACHINE              = "qemux86-64"  
DISTRO               = "poky-agl" 
DISTRO_VERSION       = "13.0.1+snapshot-20220705"
TUNE_FEATURES        = "m64 corei7"
TARGET_FPU           = ""        
meta-offline-voice-agent = "HEAD:d4322f2bee9d4ed9cfaf13e78934e5a9e03e3918"                 
meta-pipewire                                                                                                    
meta-app-framework   = "HEAD:d8761cb048b6c008b6e9ce53038b8a3a9986e84f"               
meta-python2         = "HEAD:b901080cf57d9a7f5476ab4d96e56c30db8170a8"    
meta-qt5             = "HEAD:5ef3a0ffd3324937252790266e2b2e64d33ef34f"
meta-agl-demo        = "HEAD:ed0546433050d3010b37a9c09e8f5a64f903d1f8"
meta-networking                                                       
meta-python                                                           
meta-filesystems                                                      
meta-oe              = "HEAD:8ff12bfffcf0840d5518788a53d88d708ad3aae0"
meta-agl-core                                                       
meta-agl-core-test
meta-agl-bsp         = "HEAD:d8761cb048b6c008b6e9ce53038b8a3a9986e84f"
meta               
meta-poky            = "HEAD:f14992950eb90dc168eb82823ab69538f668f8bc"
                                                                      
Initialising tasks: 100% |##########################################################################################################################################################################| Time: 0:00:00
Sstate summary: Wanted 7 Found 0 Missed 7 Current 447 (0% match, 98% complete)
NOTE: Executing Tasks                               
NOTE: Tasks Summary: Attempted 1461 tasks of which 1453 didn't need to be rerun and all succeeded.

The directory structure of the layer meta-offline-voice-agent looks like this:

$ tree -L 3
.
├── conf
│   └── layer.conf
├── README
└── recipes-vosk
    ├── python3-srt
    │   └── python3-srt_3.5.2.bb
    └── python3-vosk
        └── python3-vosk_0.3.42.bb

The focus for this week and the following ones is to:

  • create a recipe for vosk-server.
  • continue researching how to package the required models.

Helpful Resources:

Development resources:
For issues encountered: