20200917

Wildcard selectors in uBlock Origin

The extension uBlock Origin for Chrome and Firefox allows users to block offensive elements on websites.

It works by matching the name of elements on a page against a block list, but some websites randomize element IDs, making this tricky. One can block elements with parts of their ID randomized in uBlock, but it is not automatic. First:

  • Have uBlock Origin installed
  • Identify element to block
  • Right click and select "block element"
  • uBblock will highlight the element, and a white box should appear in the lower right corner of the browser.

If the website is randomly-generating IDs, you won't be able to block the element with a fixed rule. You might see something like this:

##[randomly generated string]_[some predictable unique tag]

or

##[some predictable unique tag]_[randomly generated string]

Now, you might think that these could be blocked by "##*_[some predictable unique tag]" or "##[some predictable unique tag]_*", but uBlock does not use regular wildcard syntax. Instead, it uses CSS selector syntax, which is idiosyncratic. For example, this Reddit post suggests transforming the desired wildcard selector

###leaderboard_ad-*.block_article_ad.leaderboard_ad.etc

Into a rule expressed in terms of a CSS selector

##.block_article_ad.leaderboard_ad.etc[id^="leaderboard_ad-"]

This (1) Specifies the parts of the rule that are fixed, downstream of the wildcard, and (2) adds a string-matching query that finds elements whose ID starts with "leaderboard_ad-" (that's what "^=" means).

This post corroborates this approach, suggesting a generic format if the offending element has an ID starting with a predictable string:

www.annoyingwebsite.com##div[id^="start_of_div_name_before__randomnumber"]

This works, provided you can uniquely match the element based on a prefix. What if the unique identifier is a suffix? Referring back to CSS selector synatx, we find that ^= means "begins with" and $= means "ends with". So the wildcard rule for an element that randomizes the end of a div's ID would be:

www.annoyingwebsite.com##div[id$="_end_of_div_name_after_random_number"]

No comments:

Post a Comment