Content Moderation Case Studies: Copyright Claims On White Noise (2018)

from the white-noise-is-public-domain dept

Summary: Every platform hosting user-generated content these days is pretty much required (usually by law) to have policies in place to deal with copyright-infringing material. However, not all content on these platforms is covered by copyright, and that can lead to complications, since policies are often built on the assumption that everything must be covered by some form of copyright.

Australia-based music technologist Sebastian Tomczak, who has a PhD in computer-generated music, created a 10-hour “low level white noise” recording from scratch, made a video version of it, and posted it to YouTube. In early 2018, he discovered that there had been five separate copyright claims on the video from four separate copyright holders.

Each claim asserted that the claimant’s own white noise video was protected by copyright and that Tomczak’s video infringed on it. Amusingly, each claim designates which short segment of the 10-hour video infringes on their own work -- even though the entire 10 hours is literally the same white noise.

None of the claims demanded that Tomczak’s video be taken down, but rather sought to “monetize” it under YouTube’s ContentID offering, which allows copyright holders to leave up videos they claim are infringing but divert any advertising revenue to the copyright holder.

Somewhat incredibly, one copyright holder claims that Tomczak’s video infringes on two separate videos of their own, both of which also offer white noise.

One company involved – Catapult Distribution – says that Tomczak’s composition infringes on the copyrights of “White Noise Sleep Therapy”, a client selling the title “Majestic Ocean Waves”. It also manages to do the same for the company’s “Soothing Baby Sleep” title. The other complaints come from Merlin Symphonic Distribution and Dig Dis for similar works.

It appears that all of the claims were automated, made through various services that scan videos for similarities. However, it does not appear that any of those services first check whether the originating videos actually carry a valid copyright. Instead, the claims are often based on an entire account, and the services simply search for any similar videos, whether or not there is a valid copyright.

Decisions to be made by YouTube:

  • Is white noise even covered by copyright?
  • Should the platform allow users to claim the monetization rights on other similar videos in which there is no valid copyright?
  • If there are multiple copyright claims (and monetization claims) on the same video, how is it determined who has the rights and who gets to monetize?
  • Should automated systems be allowed to make copyright claims without any regard to actual copyright status?

Questions and policy implications to consider:

  • If copyright laws and policies are built on the assumption that every piece of content is covered by copyright, how should internet websites deal with situations in which there does not appear to be a valid copyright?
  • What are the long term implications of automated systems that do not involve any actual lawyers or experts reviewing either copyright takedown or monetization requests?

Resolution: Tomczak seemed to find the situation more amusing than anything else and noted that he’d received a few similar notices in the past. He expected that after contesting these claims, YouTube would likely drop them:

“In any of the cases where I think a given claim would be an issue, I would dispute it by saying I could either prove that I have made the work, have the original materials that generated the work, or could show enough of the components included in the work to prove originality. This has always been successful for me and I hope it will be in this case as well.”

Indeed, a few days after he contested the claims (and those claims received widespread press attention), YouTube did release all of the claims on the white noise video. Tomczak has separately argued that this case -- even with the final outcome -- suggests that parts of the system need to change.

"Hopefully cases like these with the white noise, which shows how sort of broken their copyright system is, can shed some light on it or get YouTube to think about changing their system," he said.

Originally posted on the Trust & Safety Foundation website.

Filed Under: content moderation, contentid, copyright, dmca, white noise
Companies: youtube


Reader Comments

  • Anonymous Coward, 18 Dec 2020 @ 4:06pm

    Is white noise even covered by copyright?

    Is white noise even covered by copyright?

    Compendium of U.S. Copyright Office Practices (3rd ed.)
    Chapter 300 - Copyrightable Authorship: What Can Be Registered

    313.2 Works That Lack Human Authorship

    [T]he Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.

    Examples:

    • A claim based on a mechanical weaving process that randomly produces irregular shapes in the fabric without any discernible pattern.

    • Samuel Abram, 18 Dec 2020 @ 4:59pm

      Re: Is white noise even covered by copyright?

      The problem with that is that Dr. Sebastian "Little Scale" Tomczak is Australian, and would have to be governed by Australian copyright laws instead of American ones.

      • Anonymous Coward, 18 Dec 2020 @ 5:06pm

        Re: Re: Is white noise even covered by copyright?

        … would have to be governed by Australian copyright laws…

        Not in the United States. A foreign copyright in the United States is still governed by U.S. copyright law. Material that is not copyrightable within the U.S. is still not copyrightable in the U.S.

        For instance, it doesn't matter where Naruto is domiciled. The macaque does not have an enforceable copyright in the United States. Period.

        • Samuel Abram, 18 Dec 2020 @ 5:10pm

          Re: Re: Re: Is white noise even covered by copyright?

          Fair enough.

        • ECA, 19 Dec 2020 @ 12:16pm

          Re: Re: Re: Is white noise even covered by copyright?

          The problem here comes with the RIAA and MPAA spreading around the world.
          It wasn't bad in the past, when someone could jump to Japan and get a bunch of anime CHEAP, or India and Bollywood, and bring them to the USA, and we didn't worry about OTHERS' CR. AND THEY DIDN'T WORRY ABOUT OURS.
          And how tech got spread around the world.

          For some reason, this is what's happening NOW with international CR, and the USA corps bitching that China is stealing them.
          China has been BUYING TONS of them from around the world, and most USA corps with CR have to use it/supply it to have things made. Which means China gets a copy. What happens after that point is interesting.

      • Anonymous Coward, 18 Dec 2020 @ 5:13pm

        Re: Re: Is white noise even covered by copyright?

        See 17 U.S. Code § 104 - Subject matter of copyright: National origin

        Also see Naruto stories at Techdirt.

    • Anonymous Coward, 19 Dec 2020 @ 10:08pm

      Re: Is white noise even covered by copyright?

      Irrelevant question.

      Copyright trolls only care about money and control. In this case money. They want profits, and because they can leverage ContentID to send profits their way for nothing, they do so. It doesn't matter that the content isn't copyrightable. What matters is that the ContentID system will gladly give them money if they demand it. So long as the benefit of them doing this outweighs the penalties, they will write off the penalties as the cost of doing business.

  • David, 18 Dec 2020 @ 4:29pm

    Signal processing police calls.

    even though the entire 10 hours is literally the same white noise

    If there are substantial "literally the same" parts, we are not talking about white noise. Its features may be governed by the same statistics, but that's not the same as being the same.
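A rough way to see the difference between "governed by the same statistics" and "the same" is to compare two stretches of pseudo-random noise directly. The sketch below is illustrative only: it uses Python's standard random module rather than Audacity's generator, and the seed, segment length, and helper name are arbitrary assumptions.

```python
import random

def mean_and_variance(samples):
    """Return the (mean, population variance) of a list of floats."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance

rng = random.Random(2018)   # arbitrary seed, chosen only so the run is reproducible
N = 100_000                 # arbitrary segment length

# Two consecutive segments drawn from the same uniform noise source.
segment_a = [rng.uniform(-1.0, 1.0) for _ in range(N)]
segment_b = [rng.uniform(-1.0, 1.0) for _ in range(N)]

mean_a, var_a = mean_and_variance(segment_a)
mean_b, var_b = mean_and_variance(segment_b)

# Count positions where the two segments hold exactly the same sample value.
identical_positions = sum(1 for a, b in zip(segment_a, segment_b) if a == b)

print(f"segment A: mean={mean_a:+.4f} variance={var_a:.4f}")
print(f"segment B: mean={mean_b:+.4f} variance={var_b:.4f}")
print(f"positions with identical samples: {identical_positions}")
```

Both segments should report a mean near 0 and a variance near 1/3 (the value for a uniform distribution on [-1, 1]), while sharing essentially no identical samples.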

    • Anonymous Coward, 18 Dec 2020 @ 4:36pm

      Re: Signal processing police calls.

      From the (linked above) TorrentFreak article:

      “… The video was created by generating a noise waveform of 10 hours length using the freeware software Audacity and the built-in noise generator. The resulting 10-hour audio file was then imported into ScreenFlow, where the text was added and then rendered as one 10-hour video file,” [Sebastian Tomczak] explains.

      • David, 18 Dec 2020 @ 5:26pm

        Re: Re: Signal processing police calls.

        And your point was? That the whole sequence is governed by the same statistics? That's not the same as being the same.

        • Anonymous Coward, 18 Dec 2020 @ 5:34pm

          Re: Re: Re: Signal processing police calls.

          And your point was?

          You may read Sebastian Tomczak's explanation for yourself. You may also consider whether the facts that he has related there are consistent with your previous understanding.

          Or you might persist in being obtuse.

    • Anonymous Coward, 19 Dec 2020 @ 3:29am

      Re: Signal processing police calls.

      You are correct in that two samples of random noise will not be literally the same. Use of the word literally here is the classic definition that does not include figuratively.

      If one were to look at the characteristics of each sample, one could make the claim that they were characteristically the same. They used the same pseudo random number generator and the same application with the same inputs, etc.

      It's not the same as being the same - lol

      • Anonymous Coward, 19 Dec 2020 @ 3:47am

        Re: Re: Signal processing police calls.

        They used the same pseudo random number generator

        If the run (recording) time is long enough, those generators repeat their output; as they do if started with the same initial values. I suspect 10 hours is long enough for several repeats of the generated sequence.

        • Anonymous Coward, 19 Dec 2020 @ 4:45am

          Re: Re: Re: Signal processing police calls.

          Actually, repeating over a 10 hour period is unlikely. Assuming a 32 bit PRNG, we have a period of 4,294,967,296 cycles. Assuming CDROM quality of 44,100 samples per second, that would take 97,391.55 seconds before repeating. Or about 27 hours. If a larger PRNG was used, the period would be larger. For instance, the period of a 64 bit generator would be over 13 million years.
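The arithmetic in this comment can be checked with a short sketch. It assumes, as the comment does, that the generator emits exactly one value per audio sample and that its period is 2 raised to its state size in bits. The function name and the 365.25-day year are illustrative choices, not anything taken from Audacity.

```python
SAMPLE_RATE_HZ = 44_100                              # CD-quality rate assumed in the comment
SECONDS_PER_HOUR = 3_600
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR    # assumption: 365.25-day year

def seconds_until_repeat(state_bits, sample_rate_hz=SAMPLE_RATE_HZ):
    """Playback time before a full-period PRNG with `state_bits` of state wraps around.

    Assumes one PRNG output per audio sample and a period of 2**state_bits.
    """
    return (2 ** state_bits) / sample_rate_hz

secs_32 = seconds_until_repeat(32)
hours_32 = secs_32 / SECONDS_PER_HOUR
years_64 = seconds_until_repeat(64) / SECONDS_PER_YEAR

print(f"32-bit period: {secs_32:,.2f} s, about {hours_32:.1f} hours")
print(f"64-bit period: about {years_64 / 1e6:.1f} million years")
```

Under those assumptions, even the 32-bit case outlasts a 10-hour recording. A stereo file drawing two values per sample frame would halve these times.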

          • Anonymous Coward, 19 Dec 2020 @ 5:56am

            Re: Re: Re: Re: Signal processing police calls.

            Assuming a 32 bit PRNG…

            The glibc sources are, of course, widely available. Here are current sources for random.c and random_r.c.

            From the comments on lines 67-69 of random.c (which are repeated on lines 68-70 of random_r.c):

            By default, the package runs with 128 bytes of state information and generates far better random numbers than a linear congruential generator.

            Do browse the source to see the rest of that comment, including the discussion of the period of the generator.

            • Anonymous Coward, 19 Dec 2020 @ 6:55am

              Re: Re: Re: Re: Re: Signal processing police calls.

              The glibc sources are, of course, widely available.

              Although, looking more carefully at Sebastian Tomczak's explanation, and particularly with regards to his use of ScreenFlow, it appears likely that he was running that software on a Mac platform. Obviously, that does not necessarily mean that he ran Audacity on Mac. But it tends towards that guess.

            • Anonymous Coward, 19 Dec 2020 @ 6:57am

              Re: Re: Re: Re: Re: Signal processing police calls.

              And the point is?

              Given the sources you've indicated, the PRNG used has far more than 32 bits of state and therefore a period far exceeding the rather minimal period stated a few posts back. Using the default of 128 bytes and assuming one long word used for management data, the period would be on the order of 2^960, or about 10^289. Call it about 10^277 years. And that's the low end. If those 64 bits of management data were used to increase the period, it has an upper limit of 2^1024, or 10^308, or about 10^296 years as the upper limit. In any case, 32 bits of state gives a period well beyond 10 hours and increasing the state size just increases the period to ludicrous durations.

              • Anonymous Coward, 19 Dec 2020 @ 8:45am

                Re: Re: Re: Re: Re: Re: Signal processing police calls.

                Given the sources…

                Continuing to quote from the comment in random_r.c, now from lines 85-86:

                The total period of the generator is approximately deg*(2**deg - 1)

                Then looking down at the actual code on lines 121-2, it looks to me like a default of 128 bytes of state corresponds to a degree 31 polynomial(*). Plugging that back into the formula given in the comment would be 31*(2**31 - 1) or roughly 2**5 * 2**31 = 2**36.

                (*) Clearly, in an LP64 model, 128 bytes isn't going to store 31 longs successfully, although it will store 31 int32_t's. The comment about “longs” just seems to differ from the actual code in that model. Presumably the comment dates from an era when (I)LP32 was the prevalent model.
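To put that estimate in playback terms, here is a small sketch that plugs degree 31 into the formula quoted from the source comment. It assumes one generator output per audio sample at 44,100 samples per second (the rate used earlier in this thread). The function name is hypothetical, and nothing here confirms which generator Audacity actually used on Tomczak's machine.

```python
SAMPLE_RATE_HZ = 44_100   # assumption: one PRNG value per sample at CD rate

def approximate_period(deg):
    """Approximate period from the quoted source comment: deg * (2**deg - 1)."""
    return deg * (2 ** deg - 1)

period = approximate_period(31)
ratio_to_2_pow_36 = period / 2 ** 36      # how close the 2**36 rounding is
hours = period / SAMPLE_RATE_HZ / 3_600
days = hours / 24

print(f"period: {period:,} outputs")
print(f"ratio to 2**36: {ratio_to_2_pow_36:.3f}")
print(f"playback time before repeating: {hours:.0f} hours, about {days:.1f} days")
```

On these assumptions the sequence would run for well over two weeks of audio before cycling, so a 10-hour file would not contain a repeat.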

        • Anonymous Coward, 19 Dec 2020 @ 4:58am

          Re: Re: Re: Signal processing police calls.

          … those generators repeat their output

          A very quick browse through the current Audacity source turns up line 127 in noise.cpp:

          buffer[i] = mAmp * ((rand() / div) - 1.0f);

          At first glance, this looks to me like the white noise generator uses the system rand() function.

          But note that this is the first time I've ever looked at the Audacity source, and I may be mis-reading it horribly. And, even if I'm reading the current source correctly, this is certainly a later version than was used a couple years ago.
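For readers who do not read C++, the quoted line can be paraphrased in Python. This is a sketch under assumptions, not Audacity's code: `div` is taken to be half of RAND_MAX (the quoted line does not show its definition), RAND_MAX is set to a value common on Linux systems, and Python's random.Random stands in for the C library's rand().

```python
import random

RAND_MAX = 2**31 - 1   # assumption: a common value; the C standard only guarantees at least 32767

def white_noise_buffer(count, amp=1.0, seed=1):
    """Fill a buffer with uniform noise in [-amp, amp], mirroring the quoted expression."""
    rng = random.Random(seed)            # stand-in for the C library generator
    div = RAND_MAX / 2.0                 # assumption: not shown in the quoted line
    buffer = []
    for _ in range(count):
        raw = rng.randint(0, RAND_MAX)   # plays the role of rand(): an integer in [0, RAND_MAX]
        buffer.append(amp * ((raw / div) - 1.0))
    return buffer

samples = white_noise_buffer(count=8, amp=0.5)
print(samples)
```

Dividing by half of RAND_MAX maps the raw integer onto [0, 2], and subtracting 1 centers it on zero, so each sample is uniformly distributed between -amp and +amp.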

        • Anonymous Coward, 19 Dec 2020 @ 7:24am

          Re: Re: Re: Signal processing police calls.

          Modern pseudo random number generation includes the possible use of noise encountered in the hardware to augment the sequence in addition to other sources. The specific item used here may allow such options, I do not know, but the claim was not limited to this instance.

          • Anonymous Coward, 20 Dec 2020 @ 3:56am

            Re: Re: Re: Re: Signal processing police calls.

            … noise encountered in the hardware to augment the sequence…

            When using the rand() function from the standard library, the C standard requires the sequence to be repeatable.

            Since the C standard documents are not freely available, here's a link for the final C11 committee draft. See “7.22.2 Pseudo-random sequence generation functions”. Or, more conveniently in this case, see the POSIX standard, since in this respect, POSIX is “aligned with the ISO C standard.”

            The srand() function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand(). If srand() is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand() is called before any calls to srand() are made, the same sequence shall be generated as when srand() is first called with a seed value of 1.

            (Emphasis.)

            I do recognize that your “claim” is much more hand-wavy and mushy about prng's in general, than intended to address this specific instance.

            But in this specific instance, where Audacity is using rand(), while non-repeatable hardware-generated randomness might be used to seed the prng, the standards practically prohibit hardware noise augmentation of the resulting sequence.
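The repeatability requirement can be illustrated in general terms with a minimal sketch. It uses Python's standard random module as a stand-in: this is not the C library's rand(), and Python uses a different generator, but the same property holds, namely that reseeding with the same value replays the same sequence. The function name and seeds are illustrative.

```python
import random

def noise_samples(seed, count):
    """Return `count` pseudo-random samples between -1.0 and 1.0 from a freshly seeded generator."""
    rng = random.Random(seed)   # independent generator, loosely analogous to calling srand(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(count)]

first_run = noise_samples(seed=1, count=5)
second_run = noise_samples(seed=1, count=5)
other_seed = noise_samples(seed=2, count=5)

print("same seed, same sequence:", first_run == second_run)
print("different seed, same sequence:", first_run == other_seed)
```

Mixing fresh hardware noise into each output would break the first comparison, which is why a generator that must replay its sequence can only use such noise when choosing the seed.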

            • Anonymous Coward, 20 Dec 2020 @ 6:48am

              Re: Re: Re: Re: Re: Signal processing police calls.

              For what it's worth ....

              OP comment to which I replied:
              "If there are substantial "literally the same" parts, we are not talking about white noise."

              Sorry for the hand wavy and mushy text as you put it, but I was not addressing the specific usage in this case because the OP did not address the specific usage in this case.

              Subsequent comments pointed out that deprecated versions of pseudo random number generation repeat over time. I pointed out this particular problem has been addressed.

              • Anonymous Coward, 20 Dec 2020 @ 8:27am

                Re: Re: Re: Re: Re: Re: Signal processing police calls.

                … versions of pseudo random number generation repeat over time.

                It's probably worth clearly distinguishing between repeatable sequences and repeating cycles within sequences.

                • A repeatable sequence may conceivably be infinite in length without any internal cycles. For example, the sequence of digits or bits of π can be generated and regenerated repeatably, out to any length, even though that potentially-infinite sequence contains no cycles within it.

                • On the other hand, again for example, a sequence generated by ordering a finite set of elements necessarily has fixed length. If that sequence is expanded out to a greater length by repeating it as a sub-sequence of the expanded sequence, then by construction, that expanded sequence contains repeating cycles.

                In many applications, neither of those attributes is necessarily a “problem”.

                Especially for simulation applications, the capability to repeat a sequence in a later computation run has been considered important enough to require it in standards documents.

                In other applications, though, particularly cryptography, totally different qualities may indeed be more important. In cryptographic applications, usually it's most important that the sequence is, in some sense, “unpredictable”. Often, the desired cryptographic qualities of random numbers may have formal definitions that are ill-fit to any pseudo-random number generation process. There are quite a few cryptographic “problems”.

                For a white noise application, I'd think that the most important quality would be that the sequence is fairly gaussian-distributed in both time and frequency domains (within bandwidth limitations). Although, I do suspect that some rather non-gaussian distributions may sound “whitish” enough for casual listening.
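The difference between a repeatable sequence and a sequence containing internal cycles can be made concrete with a deliberately tiny generator. The sketch below uses a toy linear congruential generator whose parameters (a=5, c=3, m=16) were picked only so that the cycle is short enough to see. It is not the generator used by Audacity or any C library, and both function names are hypothetical.

```python
def lcg_sequence(seed, count, a=5, c=3, m=16):
    """Generate `count` values from a toy linear congruential generator (illustrative parameters)."""
    values = []
    state = seed % m
    for _ in range(count):
        state = (a * state + c) % m
        values.append(state)
    return values

def cycle_length(values):
    """Smallest p > 0 with values[i] == values[i + p] for every valid i, or 0 if none is found.

    Only candidates up to half the sequence length are checked, so a cycle
    must appear at least twice in `values` to be detected.
    """
    n = len(values)
    for p in range(1, n // 2 + 1):
        if all(values[i] == values[i + p] for i in range(n - p)):
            return p
    return 0

run_a = lcg_sequence(seed=7, count=48)
run_b = lcg_sequence(seed=7, count=48)
short_run = lcg_sequence(seed=7, count=16)

print("repeatable (same seed, same output):", run_a == run_b)
print("cycle length seen in 48 values:", cycle_length(run_a))
print("cycle length seen in 16 values:", cycle_length(short_run))
```

Repeatability holds at any length, while the internal cycle only becomes visible once the output is longer than the generator's period.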

                • Anonymous Coward, 20 Dec 2020 @ 8:39am

                  Re: Re: Re: Re: Re: Re: Re: Signal processing police calls.

                  gaussian-distributed in both time and frequency domains (within bandwidth limitations).

                  On re-reading, that is an extremely clumsy way to say flat frequency distribution over some band-limited region.

                  Sorry. That's what I meant.

                • Anonymous Coward, 20 Dec 2020 @ 2:03pm

                  Re: Re: Re: Re: Re: Re: Re: Signal processing police calls.

                  It is also worth pointing out that random number generation is not limited to software based systems.

                  "In computing, a hardware random number generator (HRNG) or true random number generator (TRNG) is a device that generates random numbers from a physical process, rather than by means of an algorithm. Such devices are often based on microscopic phenomena that generate low-level, statistically random "noise" signals, such as thermal noise, the photoelectric effect, involving a beam splitter, and other quantum phenomena. These stochastic processes are, in theory, completely unpredictable, and the theory's assertions of unpredictability are subject to experimental test. This is in contrast to the paradigm of pseudo-random number generation commonly implemented in computer programs. "

                  Hardware random number generator

  • That One Guy, 18 Dec 2020 @ 4:36pm

    'You know what, looks like we were mistaken...'

    You could solve a massive amount of copyright related issues and abuses if you simply made the law equal, such that the penalties for issuing bogus claims were treated and punished just as harshly as claims of infringement.

    This would not only make people much more careful regarding what they claimed was infringement but by making the penalties equal it would also provide an incentive to bring penalties for infringement down to sane levels, because someone sending out claims would always have to face the possibility that they might be on the receiving end of the penalty.

  • Anonymous Coward, 18 Dec 2020 @ 4:48pm

    I remember when Techdirt reported on this. Needless to say, antidirt and the usual copyright advocates (read: fanatics) got their panties so twisted that someone might question the validity of copyright, they ended up rediscovering the Gordian Knot. The reason why you haven't heard about it is because they hold the copyright to it.

  • Rekrul, 18 Dec 2020 @ 4:58pm

    A couple years ago, I filmed my back window, at night in the summer just to record how loud the insects were. I uploaded it to YouTube and was immediately informed that the audio had been muted due to a copyright claim.

    There was literally no other sound other than the insects and a slight hum from my computer's fans.

  • wereisjessicahyde, 18 Dec 2020 @ 5:48pm

    Hmm

    If you click on the link to the video today it looks like it's been claimed by somebody else again... claimed by the creator of a white noise video that was made 2 years later.

    It's insane.

    • stine, 19 Dec 2020 @ 3:29am

      Re: Hmm

      The date of upload doesn't have to be the same as the date of creation/copyright.

      • wereisjessicahyde, 24 Dec 2020 @ 2:34am

        Re: Re: Hmm

        I'm not talking about the date of upload. The other video says it was created in 2017, which is 2 years later. Apart from that it shouldn't have been claimed at all.

        • Anonymous Coward, 24 Dec 2020 @ 3:00pm

          Re: Re: Re: Hmm

          The other video says it was created in 2017, which is 2 years later.

          Copyright wasn't claimed in the time domain. It was claimed in the frequency domain.

           

           

          (Don't worry if you don't really get the joke—it's a pretty weak one, even for a technical joke. Just rate it at about 0.7 chuckle, if you need to ask.)

  • Anonymous Coward, 19 Dec 2020 @ 3:34am

    Apparently everything is now copyrighted (copywrote?) at this point in time and no one can create anything in the future due to copyright .... isn't this the opposite of what copyright is supposed to be doing?

    • Anonymous Coward, 19 Dec 2020 @ 6:02am

      Re:

      copyrighted (copywrote?)

      Surely copyrightten (copywritten).

      isn't this the opposite of what copyright is supposed to be doing?

      Surely copyright on white noise is needed to incentivize the creation of more white noise.
