              Ogg Vorbis - The Free
              Alternative To MP3

              Posted by emmett on
              Monday August 14, @10:04AM
              from the blinded-me-with-science
              dept.

              The fight to keep standards Open
              and Free is raging in the audio
              compression business. With mp3
              tearing up bandwidth and the court
              system, Christopher Montgomery and
              the rest of the Ogg Vorbis team
              are working hard to ensure that
              the mp3 format has a Free
              alternative in their system, which
              seems to outperform mp3 everywhere
              it counts. I got the opportunity
              to pull Chris away from
              development just long enough to
              tell us exactly what's going on,
              and to answer some questions about
              the process and the product
              necessary to take on mp3.

              Christopher Montgomery:

              Vorbis is a hybrid time/frequency
              transform coder like mp3, but the
              similarity really ends there; it's
              more similar to TwinVQ in some
              ways (many shared mechanisms,
              albeit used somewhat differently).

              Like mp3 (and virtually every
              other useful transform coder), we
              first look for strong changes and
              natural breaks in the input audio,
              and can use this information to
              break up the incoming audio into
              different sized blocks. When you
              lose information in the frequency
              domain, the resulting noise
              spreads throughout the time
              domain. A very strong spike in
              time will get smoothed out by
              frequency quantization, so the
              larger the block, the more audible
              it is. You want to isolate these
              strong, sharp events in smaller
              blocks.
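
              As a rough illustration of the block-switching idea
              described above, the sketch below compares the energy of
              consecutive short segments and falls back to short blocks
              wherever the energy jumps sharply. The function name, the
              256/2048 block sizes and the 8x threshold are all
              assumptions made for this example; they are not taken
              from libvorbis.

```python
def choose_block_sizes(samples, short_size=256, long_size=2048, ratio=8.0):
    """Return a list of (start, size) blocks covering `samples`.

    Each long-block span is cut into short blocks when the energy of one
    short segment exceeds `ratio` times the energy of the segment before it.
    Trailing samples that do not fill a whole long block are ignored.
    """
    def energy(segment):
        # Small constant avoids comparing against exactly zero energy.
        return sum(x * x for x in segment) + 1e-12

    blocks = []
    previous = None  # energy of the most recent short segment
    for start in range(0, len(samples) - long_size + 1, long_size):
        transient = False
        for i in range(start, start + long_size, short_size):
            current = energy(samples[i:i + short_size])
            if previous is not None and current > ratio * previous:
                transient = True
            previous = current
        if transient:
            # Isolate the sharp event in small blocks.
            blocks.extend((i, short_size)
                          for i in range(start, start + long_size, short_size))
        else:
            blocks.append((start, long_size))
    return blocks
```

              Feeding it silence with a single click should give long
              blocks everywhere except the span containing the click,
              which is split into short blocks.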

              Past this point, the similarities
              with mp3 end. Vorbis can do a
              time-domain pre-encoding using
              wavelets to further reduce
              spreading of time events and
              non-tone data. The current
              libvorbis doesn't have the code to
              do this yet, but the hooks are
              there for when we do finish this
              code (this feature will be post
              1.0. Wavelets are still something
              novel that no one else is using in
              serious production yet, and we
              need to do more real R&D before
              it's ready).

              Vorbis takes the time data
              directly to the frequency domain
              with an MDCT, where mp3 first
              subbands the data. The polyphase
              pseudo-QMF filter that mp3 uses
              for subbanding is not completely
              orthogonal; no matter how good the
              implementation, there will always
              be some aliasing. For this reason,
              Vorbis dispenses with subbanding
              altogether and just uses a large
              MDCT.

              Vorbis then computes line-by-line
              masking curves for local peaks,
              long-distance simultaneous tone
              masking, simultaneous noise
              masking and temporal masking.
              These curves are used to separate
              inaudible tones from audible
              tones, and then choose a frequency
              domain amplitude curve that
              represents the 'base energy' of
              that audio frame. The base energy
              curve (I call it a floor) is
              subtracted from the MDCT data
              (like a whitening filter), which
              produces 'frequency residue'. The
              floor is converted to an LSP (line
              spectral pair) representation and
              then it and the MDCT residue are
              vector quantized into the final
              output codewords by a cascade of
              custom VQ codebooks that are
              packed along in the header of the
              bitstream. The result is one
              vorbis audio packet.
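
              A minimal sketch of the floor/residue split, assuming a
              simple moving average of log magnitudes stands in for the
              masking-derived floor curve (the real encoder derives it
              from psychoacoustic analysis and then stores it as LSP
              coefficients, neither of which is modeled here). The
              function names and the smoothing width are hypothetical.
              The decoder-side recombination is assumed to be a
              line-by-line multiplication of floor and residue.

```python
import math

def split_floor_residue(spectrum, half_width=4):
    """Split spectral lines into a smooth floor and a flatter residue.

    The floor is a moving average of log-magnitudes. Subtracting it in the
    log domain is the same as dividing by it in the linear domain.
    """
    logs = [math.log(abs(c) + 1e-9) for c in spectrum]
    floor = []
    for i in range(len(logs)):
        lo = max(0, i - half_width)
        hi = min(len(logs), i + half_width + 1)
        floor.append(math.exp(sum(logs[lo:hi]) / (hi - lo)))
    residue = [c / f for c, f in zip(spectrum, floor)]
    return floor, residue

def recombine(floor, residue):
    """Decoder side: scale each residue value by the floor at that line."""
    return [f * r for f, r in zip(floor, residue)]
```

              The residue keeps the sign and fine detail of each line
              while the floor carries the overall spectral shape, which
              is why the two can be quantized separately.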

              The audio packet is then embedded
              into an Ogg bitstream page and the
              page (when full of packets) is
              shipped out in the stream.
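
              To illustrate how packets might be grouped into pages,
              here is a simplified sketch. It assumes the Ogg
              convention of describing each packet as a run of 255-byte
              segments followed by one shorter segment, with at most
              255 segments per page; it ignores page headers, checksums
              and packets that span pages. The function names are
              hypothetical.

```python
def lacing_values(packet_len):
    """Segment sizes for one packet: full 255-byte segments, then a final
    segment shorter than 255 (possibly 0) that marks the packet's end."""
    return [255] * (packet_len // 255) + [packet_len % 255]

def group_into_pages(packet_lens, max_segments=255):
    """Greedily assign whole packets to pages of at most `max_segments`
    segments. Splitting a packet across pages is not handled here."""
    pages, current, used = [], [], 0
    for length in packet_lens:
        need = len(lacing_values(length))
        if need > max_segments:
            raise ValueError("packet too large for this simplified sketch")
        if used + need > max_segments:
            # The page is full: ship it and start a new one.
            pages.append(current)
            current, used = [], 0
        current.append(length)
        used += need
    if current:
        pages.append(current)
    return pages
```

              A real streamer would usually flush pages well before
              they are full to keep latency down; the greedy fill above
              only mirrors the "when full of packets" wording.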

              The decode side does the reverse,
              but without all the masking
              analysis. We extract the string of
              packets from the Ogg bitstream,
              and for each packet unpack the
              floor and residue, take the dot
              product and then do an inverse
              MDCT to recover the time-audio
              frame. Each frame is lapped and
              added to the previous frames and
              we get the original audio out.
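
              The lap-and-add step can be sketched with a direct (slow)
              MDCT and inverse MDCT. This is an illustration under
              stated assumptions, not libvorbis code: it uses a plain
              sine window rather than whatever window the codec
              actually uses, skips floor and residue entirely, and the
              function names are made up for the example. Each block of
              2N samples yields N coefficients; the inverse output of
              any single block contains aliasing, which cancels once
              neighboring blocks are overlapped and added.

```python
import math

def sine_window(two_n):
    """Window whose squared halves sum to one, as overlap-add requires."""
    return [math.sin(math.pi * (n + 0.5) / two_n) for n in range(two_n)]

def mdct(block):
    """2N windowed samples -> N coefficients (direct O(N^2) form)."""
    n_half = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / n_half
                                    * (n + 0.5 + n_half / 2) * (k + 0.5))
                for n in range(2 * n_half))
            for k in range(n_half)]

def imdct(coeffs):
    """N coefficients -> 2N aliased time samples.

    The 2/N scale assumes the same window is applied before the forward
    transform and again after the inverse transform.
    """
    n_half = len(coeffs)
    return [sum(coeffs[k] * math.cos(math.pi / n_half
                                     * (n + 0.5 + n_half / 2) * (k + 0.5))
                for k in range(n_half)) * 2.0 / n_half
            for n in range(2 * n_half)]

def round_trip(signal, n_half):
    """Window, transform, invert, window again, and overlap-add."""
    two_n = 2 * n_half
    window = sine_window(two_n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - two_n + 1, n_half):
        block = [signal[start + n] * window[n] for n in range(two_n)]
        frame = imdct(mdct(block))
        for n in range(two_n):
            out[start + n] += frame[n] * window[n]
    return out
```

              Only samples covered by two overlapping blocks are
              reconstructed, so the first and last N samples of the
              output are incomplete in this sketch.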

              Very simple, see? :-) To be fair,
              the masking analysis is the only
              real black magic. What I'm doing
              is almost entirely based on the
              masking curve data published in
              the late 50's by Robert Ehmer.

              One thing the current release of
              Vorbis does not have is channel
              coupling (like mid-side stereo,
              although we'll be doing it
              differently). Beta 1 and beta 2
              actually include multiple totally
              separate channels. The fact that
              we equal and better mp3's quality
              while missing this huge piece is
              exciting. Mid/side stereo in mp3
              drops the final bitrate of a
              stereo stream by 30-50kbps. To get
              a real comparison of Vorbis vs.
              mp3, compare mono streams or force
              the mp3 encoder not to use
              joint/intensity stereo (e.g., -m m
              in LAME 3.84). Vorbis at 56kbps
              mono beats mp3 at 80kbps. At equal
              bitrate there's no comparison at
              all.
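
              For readers unfamiliar with mid/side stereo, the idea can
              be sketched as follows. This shows the conventional
              sum/difference transform, which is not how Vorbis planned
              to couple channels (Montgomery says it will be done
              differently); the function names are hypothetical. When
              left and right are similar, the side channel carries very
              little energy and is cheap to code.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half-difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Invert to_mid_side: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def energy(channel):
    """Sum of squared samples, a crude proxy for coding cost."""
    return sum(v * v for v in channel)
```

              Comparing energy(side) with energy(left) on a nearly mono
              recording shows where the bitrate saving comes from.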

              Slashdot: For those just tuning in,
              what's the project all about, and
              how did it get started?

              The Vorbis codec is a lossy audio
              compression codec similar to mp3,
              but we're shooting for better
              performance (lower bitrates for a
              given level of quality) as well as
              keeping it totally Free as in Beer
              and Speech. I started work on
              Vorbis a week or two after
              Fraunhofer sent out 'cease and
              desist' letters to several free
              mp3 encoder projects in the fall
              of '98. At that point, it was
              clear the worst case was
              happening; the squeeze was on by
              commercial entities to not only
              dominate the legal distribution of
              music, but the underlying
              technology as well. A 'free
              license' to owned technology means
              nothing (and that's why Real and
              Windows Media are also worthless
              as infrastructure to us).

              Fraunhofer (and MPEG in general)
              and the RIAA are also a bit too
              friendly behind the scenes, if not
              entirely in bed together. If you
              really believe SDMI is about
              protecting the artists, well, I
              have some wonderful Oklahoma
              beachfront property for sale at
              prices that are a steal, but you'd
              better act fast!

              It's ironic that at the same time
              mp3 has been an agent to open up
              music distribution, it's becoming
              a tool for commercial interests to
              reclaim control. If online music
              is to fulfill its potential, an
              oligarchy can't be allowed to
              control its distribution or the
              technology behind it. The Internet
              would not have reached critical
              mass if it was a product of
              Microsoft or AOL or Oracle... It
              wouldn't ever have happened.
              Corporate control of every facet
              of online music will just strangle
              it in the cradle. The inventors of
              the Internet 'gave it away,' and
              that's been a great thing for
              business. However, the important
              lesson here is that the
              foundations were set in stone and
              wrought from iron before any
              company had self-interested
              influence. TCP/IP (brought to you
              by research laboratories) is
              elegant and farsighted; it's taken
              thirty years for it to begin
              wearing thin. E-mail is similarly
              brought to you from academia.
              HTML, on the other hand, (as
              ultimately brought to you by
              Netscape and Microsoft) makes good
              engineers weep and gnash their
              teeth.

              We need to have unbreakable free
              music foundations in place before
              letting the commercial interests
              have their way with the
              infrastructure. I wouldn't rely on
              any infrastructure they build
              themselves.

              Ogg and Vorbis are trying to
              continue the principles for which
              we in the open world see mp3
              standing.

              Slashdot: What are you working on
              right now?

              Vorbis second beta. General
              quality improvements, additional
              bitrate modes in the encoder
              (96-350kbps stereo, mono modes),
              bugfixes, etc. After beta 2 (look
              for it on Tuesday at about the time
              LinuxWorld Expo in San Jose
              opens), we have low bitrate modes
              to finish, channel coupling (joint
              stereo and joint surround) and
              constant bitrate modes (Vorbis by
              default is VBR).

              Others in the project are working
              on tools... Mike Smith, Kenneth
              Arnold and others are knee deep in
              utils, Jack and Chad of Icecast
              are adding Ogg streaming to
              Icecast, Ralph Giles and Rob Kaye
              are working on stream mixing and
              metadata streams (Ralph is also
              hacking on MNG over Ogg). Kim,
              Tori and Emily at iCast are
              writing documentation...

              The project has also outgrown our
              group. There are now Vorbis news
              sites (like govorbis.com and
              vorbiszone.com), an all-Vorbis
              music label (vorbisonic.com) and
              other Vorbis-related sites popping
              up. angrycoffee.com is working on
              Vorbis tutorials for beginners.

              Within the core team, we need to
              get more people who are up on
              signal processing, like those in
              the community around LAME.

              Slashdot: Is this your full-time
              thing?

              Yes. Ogg and Vorbis development
              are sponsored by iCast and they're
              also deploying it internally. In
              addition to paying salaries,
              they're pitching it to the
              industry and providing legal
              assistance.

              Slashdot: Xiphophorus is a
              collection of people, projects and
              tools. What's going on with the
              collective?

              Vorbis is a 'serious' project now,
              so we're expensing the massive
              espresso consumption ;-) The few
              of us who are now getting paid to
              do this can afford to be extremely
              intense about it. Other
              contributors still come and go.
              Right now, we're all pretty much
              focused on Ogg Vorbis; I have to
              apologize to all the cdparanoia
              users out there. I'll be working
              on it again in the future, but
              right now I only have so many
              cycles.

              Ogg and Vorbis are currently
              getting more outside attention
              than we can really gracefully
              handle (well, handle and still get
              work done at the rate we're used
              to, which was still always slower
              than we want ;-) Apparently
              someone on some list claimed
              'Vorbis was dead' because we
              hadn't updated the Web site in a
              month. Ha! If we were 'dead' we'd
              have plenty of time to write HTML
              :-) And answer mail. Anyone who
              sent me personal mail in the
              past month and a half, I'll answer
              it eventually, I promise...

              Slashdot: Are you out to replace
              mp3 as the sound format of choice?
              If not, why not, and if so, what
              are the challenges?

              We're out to keep things Free
              (capital F intentional). If MPEG
              turned around and made the mp3
              spec and patents public domain,
              we'd definitely declare victory
              (and then continue coding to
              improve Vorbis). But we all know
              that isn't going to happen. More
              likely, if Fraunhofer decides
              we're a threat, they'll just delay
              licensing (remember kids: free
              licenses to binaries aren't worth
              jack) until the competition dies
              down. Then they'll squeeze again.

              Honestly, I don't think we're
              going to 100% replace mp3 (people
              still use RAR for Christ's sake).
              I lay better than even odds on us
              eclipsing mp3 in the next year if
              the licensing picture stays the
              same. We also intend to have 80-96
              kbps stereo streams that sound
              better than mp3 128 by that point,
              so people (and businesses) won't
              exactly have to give anything up
              to save money. Also expect
              hardware support soon, possibly by
              end of year if things go smoothly.

              Slashdot: You talk a lot on your
              Web site about Open software.
              Which came first, the desire to
              deliver multimedia, or the drive
              to develop it openly?

              My real hacking skills germinated
              at the MIT Lab for Computer
              Science. I'd coded practically all
              my life before getting to MIT, but
              I'd always been the best coder I
              knew, so I hadn't really learned
              much. When I got to MIT, I didn't
              feel stupid but it drove home that
              I had a lot of catching up to do.
              Most of my mentors were from the
              previous generation (all open
              source people) but a few of the
              very hardcore people were younger
              than me, too.

              I've been a musician all my life
              too, albeit not a very good one (I
              feel a bit like Salieri in
              Amadeus) and Ogg was born in '93
              when I bought a 1 Gig hard drive
              and a sound card and thought 'this
              is unlimited space! I can put
              music on this! And do things with
              it!'. I quickly found out that a
              Gig wasn't unlimited by a long
              shot, not even in '93 (I filled it
              with mail eventually), so I
              started muddling with compression.
              Greg Hudson made an offhand remark
              about there not being any good,
              free, music compression libs at
              the time, and Squish was born. I
              got a letter from a lawyer a few
              months later politely informing me
              that 'Squish' was a registered
              trademark and if I didn't change
              the name of my software, I could
              forget ever owning anything in the
              Western World ever again. Mike
              Whitson renamed the codec
              'OggSquish'. The Ogg project was
              born. Oh, and we plan to release
              an updated Squish codec again
              sometime in the next year.

              All trademarks and copyrights on this page are owned by
              their respective owners. Comments are owned by the
              Poster. The Rest © 1997-2000 Andover.Net.