So I thought I’d try to make an automated audio-sync plugin for the video sequence editor. The idea: sync two audio strips by detecting a “clap” sound in each. It would be rather simple and fast to do in numpy.
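To sketch what I mean by “simple in numpy”: assuming we somehow had the raw samples as an array, a crude clap detector is just finding the first sample that exceeds an amplitude threshold, and the sync offset is the difference between the two clap times. This is a minimal sketch (function names and the threshold approach are my own, not anything Blender provides):

```python
import numpy as np

def find_clap(samples, rate, threshold=0.5):
    """Time (in seconds) of the first sample whose absolute amplitude
    exceeds `threshold` -- a very crude 'clap' detector."""
    hits = np.nonzero(np.abs(samples) > threshold)[0]
    if hits.size == 0:
        return None
    return hits[0] / rate

def sync_offset(a, b, rate, threshold=0.5):
    """Offset (seconds) to shift strip `b` so its clap lines up with
    the clap in strip `a`. Returns None if either clap is missing."""
    ta = find_clap(a, rate, threshold)
    tb = find_clap(b, rate, threshold)
    if ta is None or tb is None:
        return None
    return ta - tb
```

A real version would probably want an onset detector or cross-correlation instead of a bare threshold, but the point stands: once you have the samples, the hard part is trivial.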
Only problem: there is no way to access audio samples from Python! In fact, it would be hard even with the C API. I had somehow assumed there would be something like the pixels attribute on the Image datatype, but there isn’t. And even that property has the slightly annoying habit of copying and converting the whole image data every time it is accessed.
There’s a gazillion ways to load and process sound data in Python. But depending on anything with a C extension beyond what Blender ships in its standard library is next to impossible. Yes, it may work with builds specialized for a certain operating system / distribution. It may even work on Windows if you happen to know which files to copy where. But describing such an installation process is daunting, at best. It gets even worse if the addon should also work with GraphicAll or Gooseberry builds of Blender.
The only way I currently see is through an external ffmpeg process, which has to be installed/configured too. It may be hard to replicate all the sequencer logic to deal with partial strips, and even harder to deal with Metastrips, audio from Scene strips and so on…
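For completeness, here is roughly what the external-ffmpeg route would look like: decode the strip’s source file to raw mono float32 PCM on stdout and read that into numpy. This is a sketch under the assumption that an `ffmpeg` binary is on the PATH; it deliberately ignores all the sequencer logic (trims, Metastrips, Scene strips) that makes the real problem hard:

```python
import subprocess
import numpy as np

def read_samples(path, rate=48000):
    """Decode an audio/video file to mono float32 samples by piping
    through an external ffmpeg process (assumes ffmpeg is installed
    and on the PATH)."""
    cmd = [
        "ffmpeg", "-i", path,
        "-f", "f32le",     # raw 32-bit float little-endian samples
        "-ac", "1",        # downmix to mono
        "-ar", str(rate),  # resample to a known rate
        "-v", "quiet",
        "-",               # write the raw stream to stdout
    ]
    raw = subprocess.run(cmd, capture_output=True, check=True).stdout
    return np.frombuffer(raw, dtype=np.float32)
```

Even if this works, it only covers whole source files — mapping the decoded samples back onto what the sequencer actually plays is the part I don’t see a clean solution for.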
So if somebody has a good idea or found some piece of documentation/code I overlooked, please share your thoughts!