Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up <link>'s "obtain a resource" Fetch and Resource Hints integration#1142

Open
Ms2ger opened this issue Apr 28, 2016 · 36 comments
Open
Labels

Comments

@Ms2ger
Copy link
Member

Ms2ger commented Apr 28, 2016

The semantics of the protocol used (e.g. HTTP) must be followed when fetching external resources. (For example, redirects will be followed and 404 responses will cause the external resource to not be applied.)

This no longer seems to have a place now we use Fetch.

@annevk
Copy link
Member

annevk commented Apr 28, 2016

I guess the 404 requirement might need to be turned into some "ok status" check on the return value? (Seems a bit dubious for "no-cors", though I guess the platform has several of those. On the other hand, @jakearchibald thinks this is a problem with AppCache I believe... Or maybe that's because AppCache has specific behavior for certain 4xx status codes.)

@jakearchibald
Copy link
Collaborator

Here's where we discussed this for service worker w3c/ServiceWorker#258

If we decide it's ok to expose this, I'd be happy to make it ok in service worker land & fetch too.

However, if a site 403s vs 200s depending on login state, it feels like something we wouldn't want to openly reveal.

@annevk
Copy link
Member

annevk commented Apr 28, 2016

So... At least <script> already distinguishes 2xx from other statuses even for "no-cors". I think some other elements might do the same, but only <script> is clearly defined at the moment (and I believe tested).

@annevk annevk added the clarification Standard could be clearer label Apr 28, 2016
@jakearchibald
Copy link
Collaborator

jakearchibald commented Apr 28, 2016

Media elements, <track> and importScripts also fail on an error-like status code. Seems like <link>, <object>, <script> should follow the same model (the spec only says to fail on 404 right now).

The only one that can be used to detect cross-origin 404s for any resource is <object>, since the others require a specific content-type or successful parsing. However, it doesn't appear that the browsers follow spec here:

http://output.jsbin.com/heyiqa/quiet

Chrome fires a "load" for 404s. Firefox & Safari do not fire a "load" event for 404s, but they don't fire an "error" event either.

@annevk
Copy link
Member

annevk commented Apr 28, 2016

Ah I see, so the attack vector is specific 4xx status codes, not treating 2xx specially. Maybe that argues for allowing "ok" on opaque responses coupled with Cache API support? Going off a bit into the weeds here...

@jakearchibald
Copy link
Collaborator

I'd be delighted with that, but I think it gives more visibility than we have today, so I'd like a second opinion from The Security Avengers.

For example, I'm not sure I can determine !ok for an HTML page in browsers today. I guess I could use <object> assume no "load" event within x seconds means !ok, that works in Safari & Firefox. (or I could use appcache)

@annevk
Copy link
Member

annevk commented Apr 28, 2016

Yeah, maybe it does, depends on the precise processing model of the various elements and that's likely not generic enough.

@bzbarsky
Copy link
Contributor

Firefox & Safari do not fire a "load" event for 404s, but they don't fire an "error" event either.

This is a bug in Firefox; it pretty much never fires "error" on <object>. I was about to fix that... What should I be doing for the 404 case? ;)

@bzbarsky
Copy link
Contributor

I guess the current Firefox behavior is to fire "load" on success, nothing on failure, so you can just wait a bit and then decide the thing failed. Firing "error" won't make this any worse, at least....

@bzbarsky
Copy link
Contributor

since the others require a specific content-type or successful parsing.

<script> requires neither. Simple testcase:

data:text/html,<script onerror="alert(1)" onload="alert(2)" src="data:text/plain,a("></script>

This alerts 2 in browsers.

@annevk
Copy link
Member

annevk commented Apr 29, 2016

I know @mikewest changed Chrome to fail <script> fetching when the response's MIME type was valid and starts with image/. (And you cannot use <img> as <img> doesn't care about HTTP status codes of the final response.)

@bzbarsky
Copy link
Contributor

We should think about what <link rel="prefetch"> should do here too.

@annevk
Copy link
Member

annevk commented May 2, 2016

So yeah, it seems that <script> can be reliably used in most browsers to determine 2xx. Chrome has an exception for resources with a Content-Type that starts with image/*. X-Content-Type-Options can be used by a site to opt-out of this behavior too.

@jakearchibald
Copy link
Collaborator

Huh, do browser fire the load event even if the script doesn't parse?

@annevk
Copy link
Member

annevk commented May 2, 2016

Yeah, though window.onerror will run in that case too (with a script error of sorts).

@domenic
Copy link
Member

domenic commented May 2, 2016

So, right now the Fetch integration happens through https://html.spec.whatwg.org/multipage/semantics.html#concept-link-obtain, which is right above the quoted text. But that also seems like it should be superceded, at least in part, by #968 (CSSOM's "fetch a stylesheet" algorithm).

I'm not sure what to do about CSSOM integration, but I think the correct fix for this bug and for #1148 is the following:

  1. Remove "The semantics of the protocol used (e.g. HTTP) must be followed when fetching external resources. (For example, redirects will be followed and 404 responses will cause the external resource to not be applied.)" and replace it with an error branch that says something like "not-ok responses should result in the resource not being applied."
  2. "Once the attempts to obtain the resource and its critical subresources are complete, the user agent must, if the loads were successful..." (i.e., fire load and error events) should become steps in the algorithm after the Fetch step.
  3. Figure out whether we can fire error events, and if so, for which link types, and if not, change the spec to stop saying we can.
  4. Clarify that we expect "obtain the resource" to be followed for all "external resource links".
  5. Change "The element must delay the load event of the element's node document..." to some kind of flag (like body-ok) which is per-link-rel. Resource Hints specifically says that it should not delay the load event, while other external resources should. https://w3c.github.io/resource-hints/#load-and-error-events

So currently I think we are blocked on, as @jakearchibald says, the Security Avengers telling us about whether error events are a good idea or not.

@domenic domenic changed the title Weird comment about fetching link elements Clean up <link>'s "obtain a resource" Fetch and Resource Hints integration May 2, 2016
@annevk
Copy link
Member

annevk commented May 3, 2016

If someone has bandwidth, we could figure out what happens for rel=stylesheet today.

@bzbarsky
Copy link
Contributor

bzbarsky commented May 3, 2016

In Firefox, <link rel="stylesheet"> has the following behavior:

  1. If the network load failed entirely (connection reset, blocked by adblocker, etc), fire error.
  2. If the HTTP response was not a 2xx response, fire error.
  3. If the response has a Content-Type header and the type is not empty string or "text/css" (ignoring type parameters), then fire error if either the request was cross-origin or if the requesting document is in standards mode.
  4. If subresource integrity checks fail, fire error.
  5. Otherwise fire load.

There are also some complications in terms of what happens with @import rules. Looking at the code, it looks like we defer firing load/error until the last @import completes and then fire load or error based on whether that one @import would have fired load or error. That seems buggy to me, obviously.

@bzbarsky
Copy link
Contributor

bzbarsky commented May 4, 2016

Note that I'm rather keenly interested in knowing exactly what Chrome does here for <link rel="prefetch"> and <object>: I would like to implement sane load/error events for those in Gecko sometime last week.

@bzbarsky
Copy link
Contributor

bzbarsky commented May 4, 2016

@mikewest Do you know who would know about the Chrome behavior details here?

@bzbarsky
Copy link
Contributor

A comment on the <object> thing above: you can tell whether fallback happened or not based on geometry. So you don't need to worry about events at all, as long as you're willing to wait "long enough".

@mikewest
Copy link
Member

mikewest commented Aug 4, 2016

@bzbarsky: I'd bother @igrigorik or @yoavweiss about anything and everything <link>.

For <object>, I doubt we (and by extension WebKit) do anything sane. I think it's reasonable to trigger a load/error event. I think we'd need to do a ton of work to ensure that we never revealed an ok vs. not-ok response code, and could only do so if we made the platform worse (as you note with object's/img's fallback). Ideally, we wouldn't expose that, but I think the cat's out of the box at this point.

@bzbarsky
Copy link
Contributor

bzbarsky commented Aug 4, 2016

@mikewest Yeah, I gave up on ever hearing back from you guys and we just went ahead and implemented something. To whit:

  • Gecko just fires load/error events on <object> without worrying about it leaking any information.
  • For <link rel=prefetch> we will (it's about to land) fire a load event except in the following cases:
  1. At the point when we would consider firing the event, the <link> element is not in a DOM tree rooted at a Document (modulo webcomponenty bits; this is using the "in document" concept relevant to a web components world). In this case we fire no events at all.
  2. The load errors out (due to CSP checks or whatnot) before we ever try to contact the server. I have no idea how to spec this part, by the way, because iirc in spec terms this just leads to an error response with no indication of when/why the error happened. In this case we fire an error event.
  3. The request's response tainting is "basic" or "cors", in the "cors" case the CORS checks pass, and the response's HTTP status is not 2xx. In this case we fire an error event.

@annevk
Copy link
Member

annevk commented Sep 4, 2017

To be clear, <object> loading above is only <object data> loading, right? No navigation.

And how is the <link> behavior different from firing an error event for network errors except if the element is not in a document or if the response is not opaque and has a status that's not ok? Would you not fire an error event for a normal network error, when you cannot contact the server?

@domenic
Copy link
Member

domenic commented Dec 24, 2020

This seems significantly better nowadays. Here is my summary of what remains. We may want to close this issue and split it into other more focused issues.

/cc @domfarolino

@andreubotella
Copy link
Member

andreubotella commented Dec 26, 2020

@andreubotella investigated this in #6122 and #6156, but I'm not so sure about his conclusion for preload/resource hints. Compare to, e.g. https://w3c.github.io/resource-hints/#load-and-error-events .

I now think I dropped the ball on preload, since it fulfills all of the conditions I listed in #6122 and the spec expects the load and error events to be fired in example 5 – but with the default "process the linked resource" algorithm now being a noop, they wouldn't fire. That was an overlook on my part, rather than a deliberate decision based on tests. But as for the resource hints, since their processing model is fairly disconnected from the HTML external resource link infrastructure, I don't think "process the linked resource" applies anyway.

In any case, if the load or error events are found to be necessary, the preload spec should need a custom "process the linked resource" algorithm, and the resource hints spec should probably have to change the "should" to a "must" and flesh that out IMO.

@hiroshige-g
Copy link
Contributor

https://html.spec.whatwg.org/multipage/links.html#link-type-preload
now contains the fetch and process the linked resource steps for preloads, and it always fire load events.

However, https://wpt.fyi/results/preload/onerror-event.html expects error events are fired for non-existent resources, and Chrome/Firefox/Safari do so, except for media-related types (perhaps just not supported?) and as=fetch.

Perhaps this WPT/browsers behavior is reasonable for scripts (where main requests to the same URL would also fire error events due to HTTP Status 404) and others.

But for as=fetch I feel it might be reasonable to fire load events for 404 HTTP responses, because XHR/fetch would be successful for HTTP 404 responses (fires xhr.onload etc.). The current browser behavior for <link rel=preload as=fetch href=response-with-404> is:

@annevk
Copy link
Member

annevk commented Mar 22, 2022

Hmm, if as=x that drastically impacts behavior the setup in the specification seems wrong. Would you also expect that for image it does not error for a 404 as it does for the img element?

cc @noamr

@noamr
Copy link
Contributor

noamr commented Mar 22, 2022

Hmm, if as=x that drastically impacts behavior the setup in the specification seems wrong. Would you also expect that for image it does not error for a 404 as it does for the img element?

cc @noamr

Yes, I believe I covered this in WPT. Preloads don't handle 404 especially

@hiroshige-g
Copy link
Contributor

Yeah the preload spec doesn't handle 404 especially, but main requests do handle 404 differently depending on the types:

More generally, the question might be: in different words, what would be the condition of the error event of <link rel=preload>?

  1. Never
  2. For a network error
  3. For a network error or a response with a non-ok status
  4. For a response that would fire the main request's error event
  5. Or something else/inbetween
  • classic <script>: 4 == 3 in the spec, but 4 > 3 in Chromium impl (a response with an invalid MIME type fires <script> onerror)
  • <img>: 4 != 3 both in the spec and the browser impls (a corrupted image fires <img> onerror, while 404 HTTP status doesn't fire ` onerror)
  • <link rel=preload> in the current Chromium behavior is something between 3 and 4 (it basically follows 4, but not always).

Note: while Chromium fires <link rel=preload> error events for non-existent images in https://wpt.fyi/results/preload/onerror-event.html, probably this is because the image content is corrupted rather than the 404 status.

@annevk
Copy link
Member

annevk commented Mar 24, 2022

2 or exactly matching the main request (though would this make the as attribute mandatory?) are the only reasonable options I think from an information leakage perspective. And exactly matching the main request seems pretty tricky. At least the current architecture isn't suited for that.

@noamr
Copy link
Contributor

noamr commented Mar 24, 2022

2 or exactly matching the main request (though would this make the as attribute mandatory?) are the only reasonable options I think from an information leakage perspective. And exactly matching the main request seems pretty tricky. At least the current architecture isn't suited for that.

(2) makes the most sense to me. I don't think preloads should deal with image decoding errors... I'll investigate if this is WPTed/interoperable.

@hiroshige-g
Copy link
Contributor

Thanks for your inputs!
(2) seems fine so far (also fine with @yoavweiss according to a chat).

As for as=fetch cases, the current behavior of the browsers and needed changes to align with (2) are:

  • HTTP 404:
    • Safari: load
    • Chromium/Firefox: error =>load
  • non-404 CORS-failed responses:
    • Firefox:load =>error
    • Safari/Chromium=error
  • non-404 CSP-failed responses: all browsers=error

WPT: web-platform-tests/wpt#33382

Chromium implementation is easy and probably we need this change in order to fix https://bugs.chromium.org/p/chromium/issues/detail?id=1305317.

As the current behavior for the cases to be fixed is already not interoperable, I hope the compatibility risk is not high.

As for other 404 cases (i.e. changing from (3) to (2)), the complexity of the Chromium implementation change (if needed) is medium. Not sure about the compatibility risk though, because I haven't investigated the current behavior.

As for non-404 cases where (4)/(5) is different from (2)/(3) (e.g. image decode errors), I expect the Chromium implementation change would be much larger.

@nyaxt
Copy link
Member

nyaxt commented Mar 28, 2022

I'm OK with (2) to maximize interoperability + privacy.
I'd be curious to know the error event use-cases though. Are the current users happy with (2)? or are they interested in other forms of errors?

@hiroshige-g
Copy link
Contributor

hiroshige-g commented Apr 7, 2022

Updated WPT web-platform-tests/wpt#33382
The expected behavior of spec PR #7799 and actual browser behavior deviating from the spec PR are:

HTTP 404:

CORS Failures

  • load events for <link rel=preload as=image/style/script/fetch crossorigin=anonymous> (Firefox)

Image decode error:

Stylesheets with non-text/css MIME Type (without nosniff):

Classic scripts with non-JavaScript MIME type (without nosniff, listed in https://fetch.spec.whatwg.org/#should-response-to-request-be-blocked-due-to-mime-type?):

(let me know if I'm missing something, e.g. as for understanding the specs)

@hiroshige-g
Copy link
Contributor

While people here basically agreed on (2) in #1142 (comment), I'm leaning toward leaving this issue pending for now, because I'm not so sure about the risks and implementability: I feel some of the behavior changes in #1142 (comment) are not so small, particularly firing load events instead of error events on Firefox/Safari/Chrome for

  • HTTP status 404 for scripts and stylesheets
  • Broken images

and I don't have sufficient bandwidth or motivating dependencies right now for e.g. further investigating the uses and impacts of these events in the wild. Also changing the events for broken images might have nontrivial implementation work in Chromium.

On the other hand, I'd like to land the spec PR #7799 (perhaps with a note that the issue is still open) and its WPT (perhaps as tentative), to anyway monitor the behavior, as the current spec is anyway further deviating from the current browsers' behavior.

Does this plan look good?
Or does anyone have stronger opinions/motivations around this issue?

domenic pushed a commit that referenced this issue Apr 25, 2022
A part of #1142. Although no browsers match this exactly, and implementers aren't sure this will be implementable, this is at least closer to reality than the previous spec, and reflects the direction everyone wants to go.
aarongable pushed a commit to chromium/chromium that referenced this issue Apr 27, 2022
Previously, when XHR/fetch API reuses a
<link rel=preload as=fetch> response with HTTP 4xx,
XHR onabort was fired and fetch API's text() was stalled
(never resolved nor rejected),
due to loading cancellation,
while XHR/fetch API should be successful because
HTTP 4xx isn't considered as a network error in the spec.

This CL fixes this by not considering HTTP 4xx as an error.

As a side effect, this CL fires load events instead of
error events for <link rel=preload as=fetch> + HTTP 4xx.
While the desired behavior about the events on
`<link rel=preload as=fetch>` are still under
discussion at whatwg/html#1142,
changing the event on Chromium is probably fine,
because Firefox/Safari fire different events
(load/error, respectively) for HTTP 4xx.

Original attempt:
https://chromium-review.googlesource.com/c/chromium/src/+/3540965

Bug: 1305317, 1318618
Change-Id: Ic8ec587599f4db28837ed3c1b19b2a5a58e1a4c4
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3585999
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Cr-Commit-Position: refs/heads/main@{#996524}
mfreed7 pushed a commit to mfreed7/html that referenced this issue Jun 3, 2022
A part of whatwg#1142. Although no browsers match this exactly, and implementers aren't sure this will be implementable, this is at least closer to reality than the previous spec, and reflects the direction everyone wants to go.
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this issue Oct 14, 2022
Previously, when XHR/fetch API reuses a
<link rel=preload as=fetch> response with HTTP 4xx,
XHR onabort was fired and fetch API's text() was stalled
(never resolved nor rejected),
due to loading cancellation,
while XHR/fetch API should be successful because
HTTP 4xx isn't considered as a network error in the spec.

This CL fixes this by not considering HTTP 4xx as an error.

As a side effect, this CL fires load events instead of
error events for <link rel=preload as=fetch> + HTTP 4xx.
While the desired behavior about the events on
`<link rel=preload as=fetch>` are still under
discussion at whatwg/html#1142,
changing the event on Chromium is probably fine,
because Firefox/Safari fire different events
(load/error, respectively) for HTTP 4xx.

Original attempt:
https://chromium-review.googlesource.com/c/chromium/src/+/3540965

Bug: 1305317, 1318618
Change-Id: Ic8ec587599f4db28837ed3c1b19b2a5a58e1a4c4
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3585999
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Cr-Commit-Position: refs/heads/main@{#996524}
NOKEYCHECK=True
GitOrigin-RevId: 2feeb37f75b06beea1bafa843a11f9199ffb3739
obyknovenius added a commit to obyknovenius/wpt that referenced this issue Apr 27, 2023
According to whatwg/html#1142 we expect `onerror` event of
`<link rel=preload>` to be fired only on network errors and not on `404`.
Ms2ger pushed a commit to web-platform-tests/wpt that referenced this issue Apr 28, 2023
According to whatwg/html#1142 we expect `onerror` event of
`<link rel=preload>` to be fired only on network errors and not on `404`.
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue May 22, 2023
…on network error, a=testonly

Automatic update from web-platform-tests
Make preload links fire `onerror` event on network error

According to whatwg/html#1142 we expect `onerror` event of
`<link rel=preload>` to be fired only on network errors and not on `404`.

--

wpt-commits: 408e27341a3bdbbdaa155bf21f62e9d3a8e26920
wpt-pr: 39727
ErichDonGubler pushed a commit to ErichDonGubler/firefox that referenced this issue May 23, 2023
…on network error, a=testonly

Automatic update from web-platform-tests
Make preload links fire `onerror` event on network error

According to whatwg/html#1142 we expect `onerror` event of
`<link rel=preload>` to be fired only on network errors and not on `404`.

--

wpt-commits: 408e27341a3bdbbdaa155bf21f62e9d3a8e26920
wpt-pr: 39727
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

10 participants