Code Comments
This is pretty:

http://www.rochester.edu/news/show.php?id=3136

Researchers at the University of Rochester have digitally reproduced music in a file nearly 1,000 times smaller than a regular MP3 file.

The music, a 20-second clarinet solo, is encoded in less than a single kilobyte, and is made possible by two innovations: recreating in a computer both the real-world physics of a clarinet and the physics of a clarinet player.

The achievement, announced today at the International Conference on Acoustics, Speech, and Signal Processing held in Las Vegas, is not yet a flawless reproduction of an original performance, but the researchers say it's getting close.

"This is essentially a human-scale system of reproducing music," says Mark Bocko, professor of electrical and computer engineering and co-creator of the technology. "Humans can manipulate their tongue, breath, and fingers only so fast, so in theory we shouldn't really have to measure the music many thousands of times a second like we do on a CD. As a result, I think we may have found the absolute least amount of data needed to reproduce a piece of music."

In replaying the music, a computer literally reproduces the original performance based on everything it knows about clarinets and clarinet playing. Two of Bocko's doctoral students, Xiaoxiao Dong and Mark Sterling, worked with Bocko to measure every aspect of a clarinet that affects its sound--from the backpressure in the mouthpiece for every different fingering, to the way sound radiates from the instrument. They then built a computer model of the clarinet, and the result is a virtual instrument built entirely from the real-world acoustical measurements.

The team then set about creating a virtual player for the virtual clarinet.
They modeled how a clarinet player interacts with the instrument, including the fingerings, the force of breath, and the pressure of the player's lips, to determine how they would affect the response of the virtual clarinet. Then, says Bocko, it's a matter of letting the computer "listen" to a real clarinet performance to infer and record the various actions required to create a specific sound. The original sound is then reproduced by feeding the record of the player's actions back into the computer model.

At present the results are a very close, though not yet a perfect, representation of the original sound.

"We are still working on including 'tonguing,' or how the player strikes the reed with the tongue to start notes in staccato passages," says Bocko. "But in music with more sustained and connected notes the method works quite well and it's difficult to tell the synthesized sound from the original."

As the method is refined, the researchers imagine that it may give computer musicians more intuitive ways to create expressive music by including the actions of a virtual musician in computer synthesizers. And although the human vocal tract is highly complex, Bocko says the method may in principle be extended to vocals as well.

The current method handles only a single instrument at a time. However, in other work in the University's Music Research Lab with post-doctoral researcher Gordana Velikic and Dave Headlam, professor of music theory at the University of Rochester's Eastman School of Music, the team has produced a method of separating multiple instruments in a mix, so the two methods can be combined to produce a very compact recording.

Bocko believes that the quality will continue to improve as the acoustic measurements and the resulting synthesis algorithms become more accurate, and he says this process may represent the maximum possible data compression of music.

"Maybe the future of music recording lies in reproducing performers and not recording them," says Bocko.

This research is funded by the National Science Foundation.

Mark Nelson - http://marknelson.us
> This is pretty:
>
> http://www.rochester.edu/news/show.php?id=3136
>
> Researchers at the University of Rochester have digitally reproduced
> music in a file nearly 1,000 times smaller than a regular MP3 file.

Well, that's surely good work and goes a couple of steps in the right direction. But - to be honest - this is nothing really revolutionary! The MIDI standard was proposed a *quarter century* ago, and MIDI also describes sounds (e.g. in the form of musical terms) rather than using samples. Only a couple of years later, samples of instruments (like this clarinet) were combined with such a "musical description", and file formats like MOD emerged. Such formats became very popular and achieved similarly astounding "compression ratios".

A proper method for a good 'wav2mod' converter is long overdue ... ;-)

Christian
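To make the "musical description versus samples" point concrete, here is a minimal sketch comparing the size of a note-event list with raw PCM for a 20-second mono clip. The 6-byte record layout (`NOTE_FORMAT`), the helper names, and the 40-note phrase are all invented for illustration; this is not the MIDI wire format, the MOD format, or the Rochester encoding.

```python
import struct

# Hypothetical note record: start_ms, duration_ms, pitch number, velocity.
# "<HHBB" = two unsigned 16-bit ints + two unsigned bytes, no padding (6 bytes).
NOTE_FORMAT = "<HHBB"
NOTE_SIZE = struct.calcsize(NOTE_FORMAT)


def encode_notes(notes):
    """Pack a list of (start_ms, duration_ms, pitch, velocity) tuples into bytes."""
    return b"".join(struct.pack(NOTE_FORMAT, *note) for note in notes)


def decode_notes(blob):
    """Inverse of encode_notes: unpack the byte string back into tuples."""
    return [
        struct.unpack_from(NOTE_FORMAT, blob, offset)
        for offset in range(0, len(blob), NOTE_SIZE)
    ]


def pcm_bytes(seconds, sample_rate=44_100, bytes_per_sample=2, channels=1):
    """Size of uncompressed PCM audio (defaults: 44.1 kHz, 16-bit, mono)."""
    return seconds * sample_rate * bytes_per_sample * channels


# Made-up 20-second phrase: 40 notes of 500 ms each.
notes = [(i * 500, 500, 60 + (i % 12), 90) for i in range(40)]
blob = encode_notes(notes)

event_size = len(blob)        # bytes used by the note-event description
sample_size = pcm_bytes(20)   # bytes used by the raw samples
ratio = sample_size / event_size

if __name__ == "__main__":
    print(f"events: {event_size} bytes, PCM: {sample_size} bytes, ratio: {ratio:.0f}x")
```

The event list says nothing about timbre, so the decoder needs an instrument (a sample bank or a physical model) to turn it back into sound. That is why the size comparison alone flatters any description-based format.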
According to Christian <iBBiS@gmx.de>:
> Well, that's surely good work and goes a couple of steps in the
> right direction. But - to be honest - this is nothing really
> revolutionary! The MIDI standard was proposed a *quarter century*
> ago, and MIDI also describes sounds (e.g. in the form of musical terms)
> rather than using samples.

The "revolutionary" part is probably in the inverse transform, i.e. transforming an actual recording into a MIDI-like format. I can envision that this process is hard to do automatically, and it seems that this is what the scientists achieved here, which is a great advance.

If they can do it for non-musical human voice, then phone companies will be very interested. Huge amounts of money are just waiting for the smart guy who reduces bandwidth consumption for phone calls.

--Thomas Pornin
On Apr 2, 10:41 am, Christian <iB...@gmx.de> wrote:
> Well, that's surely good work and goes a couple of steps in the
> right direction. But - to be honest - this is nothing really
> revolutionary! The MIDI standard was proposed a *quarter century*
> ago, and MIDI also describes sounds (e.g. in the form of musical terms)
> rather than using samples. Only a couple of years later, samples of

Yep, there's no doubt that the idea of modeling musical instruments has been around a long time. And of course, much speech compression depends on modeling of either the human vocal tract or other features of speech.

The hard part here, of course, is what you call wav2mod - mapping the sampled music onto the model.

An interesting question would be whether their analysis of the played music, when properly modeled, simply reproduces the physical actions of the person playing the tune. If that were the case, you could reverse engineer a recording of Benny Goodman playing Rhapsody In Blue and see exactly how he played it - fingering, breath, etc. Pretty crazy if it could be done.

Mark Nelson - http://marknelson.us
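For a feel of what "mapping the sampled music onto the model" involves, here is a toy analysis-by-synthesis sketch. The "instrument" is just a sine generator with a single control parameter, and the fit is a brute-force search over candidate frequencies. The function names, the sample rate, and the candidate grid are all assumptions made for this example; the press release does not describe the actual fitting procedure, and a real clarinet model has many more parameters than this.

```python
import math

SAMPLE_RATE = 8_000   # Hz, assumed for the toy example
NUM_SAMPLES = 400     # 50 ms of audio


def synthesize(freq_hz, amplitude=1.0):
    """Toy 'instrument model': a pure sine tone controlled by one parameter."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(NUM_SAMPLES)
    ]


def squared_error(a, b):
    """Sum of squared sample differences between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_frequency(recording, candidates):
    """Return the candidate whose synthesized output is closest to the recording."""
    return min(candidates, key=lambda f: squared_error(synthesize(f), recording))


# Stand-in for a real recording: the model's own output at 440 Hz.
recording = synthesize(440.0)

# Search a coarse grid of control values, 200 Hz to 1000 Hz in 10 Hz steps.
candidates = [float(f) for f in range(200, 1001, 10)]
estimate = fit_frequency(recording, candidates)

if __name__ == "__main__":
    print(f"estimated control parameter: {estimate} Hz")
```

Only the recovered control value needs to be stored, and the decoder regenerates the samples from it. The sketch sidesteps everything that makes the real problem hard: phase alignment, noise, a time-varying performance, and a model that only approximates the instrument.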