OpenRefine cannot look up properties in Wikibase.cloud #1

@dpriskorn

I get a 200 response, but the body is not what we expect. Instead of the property lookup result, it contains the HTML page reproduced below.

This seems to be bot detection that wikibase.cloud added recently. The challenge page says:
"Sadly, you must enable JavaScript to get past this challenge. This is required because AI companies have changed the social contract around how website hosting works."

OpenRefine stumbles on this page instead of receiving the data it asked for, so we need a way to get past the challenge.
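
Since the challenge arrives with a 200 status, a client cannot rely on the status code to notice it. As a minimal sketch, assuming the challenge page always embeds a `<script id="anubis_challenge" type="application/json">` element like the response quoted in this issue, a check could look like the following. The function names `extract_anubis_challenge` and `is_anubis_challenge` are hypothetical and not part of OpenRefine or Anubis.

```python
import json
import re

# Assumption: the challenge parameters sit in a JSON <script> element with this
# exact id and type, as in the response quoted in this issue.
_CHALLENGE_RE = re.compile(
    r'<script id="anubis_challenge" type="application/json">(.*?)</script>',
    re.DOTALL,
)


def extract_anubis_challenge(body):
    """Return the embedded challenge as a dict, or None if none is found."""
    match = _CHALLENGE_RE.search(body)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def is_anubis_challenge(status, body):
    """Hypothetical helper: True when a 200 response is the challenge page."""
    return status == 200 and extract_anubis_challenge(body) is not None
```

With a check like this, a client could at least report "blocked by a bot challenge" rather than failing while trying to parse HTML as an API result.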

<!doctype html><html lang="en"><head><title>Making sure you&#39;re not a bot!</title><link rel="stylesheet" href="/.within.website/x/xess/xess.min.css?cachebuster=v1.21.3"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="robots" content="noindex,nofollow"><style>
        body,
        html {
            height: 100%;
            display: flex;
            justify-content: center;
            align-items: center;
            margin-left: auto;
            margin-right: auto;
        }

        .centered-div {
            text-align: center;
        }

        #status {
            font-variant-numeric: tabular-nums;
        }

        #progress {
          display: none;
          width: 90%;
          width: min(20rem, 90%);
          height: 2rem;
          border-radius: 1rem;
          overflow: hidden;
          margin: 1rem 0 2rem;
					outline-offset: 2px;
					outline: #b16286 solid 4px;
				}

        .bar-inner {
            background-color: #b16286;
            height: 100%;
            width: 0;
            transition: width 0.25s ease-in;
        }
    	</style><script id="anubis_version" type="application/json">"v1.21.3"
</script><script id="anubis_challenge" type="application/json">{"rules":{"algorithm":"fast","difficulty":4,"report_as":4},"challenge":"b7fdb4c66ff457fee0e8960d6e82dd54e64bc96ea8063460d343217fed5a93944ae57bebe3c6c81f549b47170de16296295da1e5b22d28f924d9376da9aafa2294b3e628679fa0e2e1d52d140ee471a78a5bd7b10ce018466e83b27383adf88e8cb82a3ba081ec931d3072107309b51af1926ce15a9f69aa3aed04cd9b55903faae421c9f007a216d34890bdfa44e7b58dab21ed2685d3384f1e8329fbb40e88f9b60a50c04597f481dec3f097b6dc872fc52c0d0ccd369298533108ff5676711125c0881ebde1689c21b059c99499f743a66c7942f99ba9e05d03c8e2ce70a98a75f23fb7e76be78d9c4c4c3bb7e7a028d4a57751bca02ad5292b3807f4c808"}
</script><script id="anubis_base_prefix" type="application/json">""
</script></head><body id="top"><main><h1 id="title" class="centered-div">Making sure you&#39;re not a bot!</h1><div class="centered-div"><img id="image" style="width:100%;max-width:256px;" src="/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=v1.21.3"> <img style="display:none;" style="width:100%;max-width:256px;" src="/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=v1.21.3"><p id="status">Loading...</p><script async type="module" src="/.within.website/x/cmd/anubis/static/js/main.mjs?cacheBuster=v1.21.3"></script><div id="progress" role="progressbar" aria-labelledby="status"><div class="bar-inner"></div></div><details><summary>Why am I seeing this?</summary><p>You are seeing this because the administrator of this website has set up Anubis to protect the server against the scourge of AI companies aggressively scraping websites. This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.</p><p>Anubis is a compromise. Anubis uses a Proof-of-Work scheme in the vein of Hashcash, a proposed proof-of-work scheme for reducing email spam. The idea is that at individual scales the additional load is ignorable, but at mass scraper levels it adds up and makes scraping much more expensive.</p><p>Ultimately, this is a hack whose real purpose is to give a &#34;good enough&#34; placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering) so that the challenge proof of work page doesn&#39;t need to be presented to users that are much more likely to be legitimate.</p><p>Please note that Anubis requires the use of modern JavaScript features that plugins like JShelter will disable. Please disable JShelter or other such plugins for this domain.</p><p>This website is running Anubis version <code>v1.21.3</code>.</p></details><noscript><p>Sadly, you must enable JavaScript to get past this challenge. 
This is required because AI companies have changed the social contract around how website hosting works. A no-JS solution is a work-in-progress.</p></noscript><div id="testarea"></div></div><footer><div class="centered-div"><p>Protected by <a href="https://github.com/TecharoHQ/anubis">Anubis</a> From <a href="https://techaro.lol">Techaro</a>. Made with ❤️ in 🇨🇦.</p><p>Mascot design by <a href="https://bsky.app/profile/celphase.bsky.social">CELPHASE</a>.</p></div></footer></main></body></html>
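
The page above describes the challenge as a proof-of-work scheme "in the vein of Hashcash" and reports `"difficulty":4`. To illustrate the general idea only, here is a generic Hashcash-style sketch: search for a nonce so that the SHA-256 digest of the challenge string plus the nonce starts with a given number of zero hex digits. Whether Anubis uses exactly this construction, and how a solution would be submitted back to the server, is not stated in this issue, so treat both as assumptions. `solve_pow` and `verify_pow` are hypothetical names.

```python
import hashlib


def verify_pow(challenge, nonce, difficulty):
    """Check that sha256(challenge + nonce) starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith("0" * difficulty)


def solve_pow(challenge, difficulty):
    """Brute-force the smallest non-negative nonce that satisfies verify_pow.

    Assumption: a generic Hashcash-style puzzle, not necessarily the exact
    scheme the server verifies.
    """
    nonce = 0
    while not verify_pow(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

Each additional hex digit of difficulty multiplies the expected work by 16, so difficulty 4 takes about 65,536 hash attempts on average. That is negligible for a single client and adds up for mass scraping, which matches the trade-off the page describes.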
