Internet plagiarism is widespread and extensive

Content theft is a problem for content creators on the web and on social media platforms. Some estimates say duplicate web page content makes up 30% of the web. Plagiarism is also a widespread problem in academia.

Tyler Mendoza
Novice
September 3, 2021 1:43 pm

Content theft on social media is a fascinating phenomenon, primarily because the benefits of stealing content are vague and the risks are high. So why do people do it? Granted, if you copy a well-written paragraph from somewhere and post it as a Facebook status, you will probably get some reputational benefit in the form of likes and comments from your friends. The reputational risk is high though, especially if you’re not much of a writer. The original source can be found quickly and easily, leading to an embarrassing call-out and swift deletion of your status!

Tweet stealing is a widespread phenomenon. Some people do it to appear witty or intelligent to their followers, and to get new followers. Sometimes it is a straight copy and paste job. Other times the stealer will make the effort to alter the original tweet, keeping the punchline and overall message the same. Tweetdecking, however, is the deliberate stealing of tweets for monetization. A Tweetdecker is someone who steals tweets, and by doing so amasses a large number of followers. By virtue of their large following, the Tweetdecker gets approached by brands to do sponsored tweets, allowing them to earn a handsome income.

It’s the stealing of content on social media for no discernible benefit that reflects a pernicious trend. We’re addicted to likes. And it’s not healthy. Comment stealing happens a lot on popular YouTube videos. What is the benefit other than the endorphin rush you get from people appreciating your comment? You get little reputational benefit because you’re posting under a username, and the likes you collect are not totalled on your profile.

It’s almost as if a new social currency is emerging. A currency that provides temporary validation. Are we becoming so desperate for validation that we steal comments in the hope we’ll get a lot of likes? I think the answer is yes. There are so many examples of this on YouTube. Look at a comment that has received a lot of likes. Often the user will edit their original comment and add something like, “Edit: OMG, thank you for 1,000 likes! Didn’t expect that at all!” The high that people get from lots of likes is enough to make them steal comments. Plagiarism is an afterthought.

Kaitlyn Mora
Novice
September 7, 2021 9:57 am

In high school I remember a funny incident when one of my teachers was irate. One of his students had submitted an essay assignment which was clearly plagiarised. How did he know? Because the assignment had a line that said “click here for more information”. The student had copied an entire webpage and submitted it as his assignment. A rookie error.

One of the problems with plagiarism now is that it is a constant battle to keep on top of the tech. It is just like cyber security, where hackers and security experts are constantly trying to one-up each other. In that war it feels like the hackers are winning: they expose a vulnerability, and the security experts scramble reactively to patch it up.

There is a lot of anti-plagiarism software on the market. Grammarly, Copyleaks and Duplichecker are some well-known programs. But they have to keep on top of constantly evolving scrambling software. Scramblers allow you to take someone’s work and alter the words and sentence structure so that it passes plagiarism checks.
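To illustrate why scramblers are effective, here is a rough sketch of one simple way to compare two texts: count how many three-word sequences ("shingles") they share. This is only an assumption about how a naive checker might work, not a description of how Grammarly, Copyleaks or Duplichecker actually operate. The function names and sample sentences are made up for the example.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)

# Hypothetical sample texts: a straight copy and a "scrambled" rewrite.
original = "The quick brown fox jumps over the lazy dog near the river bank"
copied = "The quick brown fox jumps over the lazy dog near the river bank"
scrambled = "A fast brown fox leaps over a sleepy dog close to the river bank"

print("straight copy:", jaccard(original, copied))
print("scrambled:", round(jaccard(original, scrambled), 3))
```

The straight copy scores a perfect match, while swapping a handful of words leaves only one shared three-word sequence, so the score drops close to zero even though the meaning is the same. That is presumably why real checkers have to go well beyond simple overlap counting.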

Another problem is that writing software intended to help with your creativity can be used to get past plagiarism checks. In an r/UnethicalLifeProTips Reddit thread, a user points to Quillbot as software that helped him copy work and get past plagiarism checkers. The lead developer of Quillbot chimed in that although plagiarism wasn’t their objective, they recognize that it will be an inevitable application of their tech.

We also have to bear in mind that not all education and school budgets are created equal. Anti-plagiarism software can add to an already strained budget. When you are struggling to squeeze teaching material and resources into your allocated budget, anti-plagiarism software becomes a secondary consideration.

Tackling plagiarism gets harder still when you try to stamp out the root cause. In India, for example, academic excellence is a cultural aspiration. In a research paper, Internet and Increasing Issues of Plagiarism, the authors point to the pressure of getting good scores in exams and the high expectations of parents as incentives for students to plagiarise work. Stamping out plagiarism will be an uphill battle given the abundance of information the internet provides and the social and cultural burdens students bear.

Murray Hinton
Novice
September 6, 2021 3:14 pm

Internet plagiarism is a complex problem. A lot of it comes down to varying perceptions of what the right thing to do is. Although content is routinely stolen, those who are doing it aren’t always aware that it’s wrong. There is a widespread assumption that because content is put on the internet, it is fair game for anyone to take it and use it for their own purposes. The Digital Millennium Copyright Act might have something to say about that.

In the fitness industry, we have seen this play out between two content creators, Jeremy Ethier and (Brownie) Boo. Boo is a content creator with over 30,000 Instagram followers who shares low-calorie recipes. Jeremy Ethier runs a YouTube fitness channel with almost 4 million subscribers.

In June 2021 Jeremy got in touch with Boo to say he had tried one of her brownie recipes and loved it. He then went on to say that he was planning to include the recipe in a forthcoming cookbook that he’d be selling. Jeremy explained that the recipe would be attributed to Boo and he was happy to have a link to her Instagram page included next to the recipe.

The problem here is that Jeremy was planning to include someone else’s work in his cookbook, which he would be selling for profit. He hadn’t asked the original creator, Boo, if he could include her recipe. He simply told her it would be in the book. In Jeremy’s defense, perhaps he thought attribution in the cookbook would be a suitable payment. His judgement may have also been clouded by the incessant use of other people’s content in the YouTube fitness arena. But this didn’t fly with Boo. She had put in a lot of work to create a unique recipe and it was being taken from her to line the pockets of someone else.

Most viewers are on Boo’s side of this scandal, labelling it as an attempt from a larger creator to bully a smaller creator. Boo hasn’t backed down and summarized Jeremy’s offer as follows: “Let me get this straight: You and your brand profits off other people’s hard work, without asking for their consent, and pays them in exposure?” In hindsight, a better way to do this would have been for Jeremy to ask Boo if he could include her recipe in the cookbook and to discuss payment options, whether it would have been in the form of exposure or a percentage of book sales.

A good example of using other people’s content the right way is Andy Morgan, an online fitness coach and author who founded the websites rippedbody.com and athletebody.jp. In 2011 the fitness industry was full of ‘broscience’, or in layman’s terms, full of misinformation. Actually the sad reality is that there is still a ton of misinformation, but that’s an insight for another time. There were a few beacons of shining light back then such as Martin Berkhan, founder of Leangains, and Mark Rippetoe, founder of Starting Strength. Andy got in touch with both of them, asking if it would be ok to translate their work into Japanese, as he had been living in Japan and saw how full of nonsense the industry was there as well. They both agreed and this started his journey of translations that would become the “broscience counter-balance” in Japan. His websites are now frequently visited and Athletebody.jp is one of the most visited fitness-related websites in Japan. Not a bad job for someone who went about using other people’s work the right way.

Josef Lind
Novice
September 3, 2021 6:22 pm

It is important to clarify that duplicate web content does not automatically equate to copied content. A website could have http:// and https:// versions of the same webpage, and the two would be identified as duplicate content. Nonetheless, we can assume a large chunk of content on the internet is copied even if we can’t put an exact figure on it.
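To make the http:// versus https:// point concrete, here is a minimal sketch of how addresses for the same page could be grouped together so they are not counted as copies. The `canonical_key` helper and the example.com addresses are hypothetical, and treating a leading "www." or a trailing slash as irrelevant is my own simplifying assumption; some sites do serve different pages at those addresses.

```python
from urllib.parse import urlsplit

def canonical_key(url):
    """Reduce a URL to a key that ignores the scheme, a leading 'www.',
    and a trailing slash (a simplifying assumption, not a standard)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    key = host + path
    if parts.query:
        key += "?" + parts.query
    return key

# Hypothetical addresses: three variants of one page plus a different page.
urls = [
    "http://example.com/brownies",
    "https://example.com/brownies",
    "https://www.example.com/brownies/",
    "https://example.com/cookies",
]

groups = {}
for u in urls:
    groups.setdefault(canonical_key(u), []).append(u)

for key, members in groups.items():
    print(key, "->", len(members), "address(es)")
```

The first three addresses collapse into one group, so a count of "duplicate" pages based on raw addresses would overstate how much content has actually been copied.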

Web scraping is the method of choice in this regard. A web scraper is typically a bot that crawls the world wide web and extracts information from web pages, including text and images. A scraper is the ideal tool for someone who has no qualms about copying other people’s content and putting it on their own website.

On the internet, I see much less accountability. In academia it is a different ball game. If you are a student caught plagiarising in your dissertation or thesis, there is a big chance of being expelled from your degree program. On the internet, perhaps free access to an article on a website without a paywall or subscription gives people the impression that anything can be done with that article, including copying it word-for-word and placing it on your own website.

One blogger I know had a problem with her posts being scraped and reposted on other websites. She contacted the website owners, asking them to take down the copied posts. Their response was that they had done nothing wrong and were spreading her good content. This is a curious response. The more probable reason for copying her posts was to profit from her work. This would explain why there were ads all over the site where her content was reposted.

LinkedIn is another platform where I see little accountability when plagiarising content. Anyone who uses LinkedIn enough will have seen posts that have a generic formula. Here’s an example I’ve made up. They tend to go like this:

I was hiring for a new position. The candidate arrived 10 minutes late, looking tired and disheveled. The interview panel was not impressed. I asked the candidate why he was late. He told me he had saved a cat from a tree, saved a baby from a burning building and walked 50 miles for the interview. I hired him on the spot! He has been the best performer in the team for 3 months in a row. Never judge someone on their tardiness and appearance.

These posts are cringe-worthy. Now don’t get me wrong. They perform well, very often going viral on LinkedIn. If people get motivation or inspiration from them, then that is a positive thing. They are just not for me. My problem is how often I see the same post word-for-word going viral from different LinkedIn users. Once I saw a user who had copied one of these posts and was getting good engagement. In the comments, some said it was a copied post. The author replied that they didn’t know who the original author was, but wanted to share the message.

I am not convinced. If you are copying a post word-for-word and you don’t know the original author, either don’t post it or make it clear that it is not your work. My feeling is that the poster wanted the engagement and positive comments, which they got a lot of. But there was no sense of accountability. One explanation is that they had seen the post copied and re-used so often that they assumed there would be no harm in, and no accountability for, doing the same.
