Merge branch 'yt-dlp:master' into master

This commit is contained in:
wesson09 2026-01-20 13:42:24 +01:00 committed by GitHub
commit 58b3eaffca
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
16 changed files with 307 additions and 223 deletions


@@ -335,7 +335,7 @@ jobs:
# We need to fuse our own universal2 wheels for curl_cffi
python3 -m pip install -U 'delocate==0.11.0'
mkdir curl_cffi_whls curl_cffi_universal2
-python3 devscripts/install_deps.py --print --omit-default --include-extra curl-cffi > requirements.txt
+python3 devscripts/install_deps.py --print --omit-default --include-extra build-curl-cffi > requirements.txt
for platform in "macosx_11_0_arm64" "macosx_11_0_x86_64"; do
python3 -m pip download \
--only-binary=:all: \
@@ -464,7 +464,7 @@ jobs:
if ("${Env:ARCH}" -eq "x86") {
python devscripts/install_deps.py
} else {
-python devscripts/install_deps.py --include-extra curl-cffi
+python devscripts/install_deps.py --include-extra build-curl-cffi
}
- name: Prepare


@@ -9,7 +9,7 @@ permissions: {}
jobs:
check_nightly:
name: Check for new commits
-if: vars.BUILD_NIGHTLY
+if: github.event_name == 'workflow_dispatch' || vars.BUILD_NIGHTLY
permissions:
contents: read
runs-on: ubuntu-latest


@@ -1646,6 +1646,8 @@ Note that the default for hdr is `hdr:12`; i.e. Dolby Vision is not preferred. T
If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
If you use the `-S`/`--format-sort` option multiple times, each subsequent sorting argument will be prepended to the previous one, and only the highest priority entry of any duplicated field will be preserved. E.g. `-S proto -S res` is equivalent to `-S res,proto`, and `-S res:720,fps -S vcodec,res:1080` is equivalent to `-S vcodec,res:1080,fps`. You can use `--format-sort-reset` to disregard any previously passed `-S`/`--format-sort` arguments and reset to the default order.
**Tip**: You can use `-v -F` to see how the formats have been sorted (worst to best).
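The prepend-and-deduplicate rule described above can be sketched in a few lines. This is an illustrative helper, not yt-dlp's actual implementation; `merge_format_sort` is a hypothetical name:

```python
def merge_format_sort(*sort_args):
    # Sketch of the merging rule: each later -S argument is prepended,
    # and only the highest-priority copy of a duplicated field survives.
    merged, seen = [], set()
    for arg in reversed(sort_args):  # later arguments take priority
        for field in arg.split(','):
            name = field.split(':')[0]  # 'res:720' and 'res:1080' are both the field 'res'
            if name not in seen:
                seen.add(name)
                merged.append(field)
    return ','.join(merged)

print(merge_format_sort('proto', 'res'))                     # res,proto
print(merge_format_sort('res:720,fps', 'vcodec,res:1080'))   # vcodec,res:1080,fps
```

Both outputs match the worked examples in the paragraph above.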
## Format Selection examples
@@ -1857,7 +1859,7 @@ The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube/_base.py](https://github.com/yt-dlp/yt-dlp/blob/415b4c9f955b1a0391204bd24a7132590e7b3bdb/yt_dlp/extractor/youtube/_base.py#L402-L409) for the list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
-* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_sdkless`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `tv,android_sdkless,web` is used. If no JavaScript runtime/engine is available, then `android_sdkless,web_safari,web` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web_safari,web` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
+* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_sdkless`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `android_sdkless,web,web_safari` is used. If no JavaScript runtime/engine is available, then only `android_sdkless` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web,web_safari` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `webpage_skip`: Skip extraction of embedded webpage data. One or both of `player_response`, `initial_data`. These options are for testing purposes and don't skip any network requests
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
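Extractor-args strings like `youtube:player_client=default,-ios` decompose into an extractor name, an argument name, and a comma-separated value list. A toy parser sketching that shape (an assumption about the syntax only, not yt-dlp's actual parsing code):

```python
def parse_extractor_args(arg):
    # Toy parser for strings like 'youtube:player_client=default,-ios';
    # illustrates the shape only, not yt-dlp's real option handling.
    extractor, _, kvs = arg.partition(':')
    parsed = {}
    for kv in kvs.split(';'):  # assume multiple args are ';'-separated
        key, _, values = kv.partition('=')
        parsed[key] = values.split(',')
    return {extractor: parsed}

print(parse_extractor_args('youtube:player_client=default,-ios'))
```

A leading `-` on a value, as in `-ios`, survives parsing and marks the client for exclusion.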
@@ -2349,7 +2351,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Passing `--simulate` (or calling `extract_info` with `download=False`) no longer alters the default format selection. See [#9843](https://github.com/yt-dlp/yt-dlp/issues/9843) for details.
* yt-dlp no longer applies the server modified time to downloaded files by default. Use `--mtime` or `--compat-options mtime-by-default` to revert this.
-For ease of use, a few more compat options are available:
+For convenience, there are some compat option aliases available to use:
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
@@ -2357,7 +2359,10 @@ For ease of use, a few more compat options are available:
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
-* `--compat-options 2024`: Same as `--compat-options mtime-by-default`. Use this to enable all future compat options
+* `--compat-options 2024`: Same as `--compat-options 2025,mtime-by-default`
+* `--compat-options 2025`: Currently does nothing. Use this to enable all future compat options
Using one of the yearly compat option aliases will pin yt-dlp's default behavior to what it was at the *end* of that calendar year.
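The yearly aliases chain into each other, so each one expands to every concrete option of the later years. A minimal sketch of that expansion, using only the alias entries quoted in this hunk (illustrative data, not yt-dlp's internal tables):

```python
# Each yearly alias expands to the next year's alias plus its own options.
ALIASES = {
    '2025': [],  # currently does nothing
    '2024': ['2025', 'mtime-by-default'],
    '2023': ['2024', 'prefer-vp9-sort'],
}

def expand(opt):
    if opt not in ALIASES:
        return [opt]  # a concrete compat option, not an alias
    out = []
    for item in ALIASES[opt]:
        out.extend(expand(item))
    return out

print(expand('2023'))  # ['mtime-by-default', 'prefer-vp9-sort']
```

So an older year always implies a superset of a newer year's options, which is what pins behavior to the end of that calendar year.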
The following compat options restore vulnerable behavior from before security patches:


@@ -20,7 +20,7 @@ INCLUDES=(
)
if [[ -z "${EXCLUDE_CURL_CFFI:-}" ]]; then
-INCLUDES+=(--include-extra curl-cffi)
+INCLUDES+=(--include-extra build-curl-cffi)
fi
runpy -m venv /yt-dlp-build-venv


@@ -59,12 +59,19 @@ default = [
"yt-dlp-ejs==0.3.2",
]
curl-cffi = [
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.14; implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.15; implementation_name=='cpython'",
]
+build-curl-cffi = [
+"curl-cffi==0.13.0; sys_platform=='darwin' or (sys_platform=='linux' and platform_machine!='armv7l')",
+"curl-cffi==0.14.0; sys_platform=='win32' or (sys_platform=='linux' and platform_machine=='armv7l')",
+]
secretstorage = [
"cffi",
"secretstorage",
]
+deno = [
+"deno>=2.6.5", # v2.6.5 fixes installation of incompatible binaries
+]
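The environment markers on the two `build-curl-cffi` pins above select exactly one pin per build platform. A rough Python rendering of that selection logic (the real evaluation is done by the installer from the PEP 508 markers; `pick_curl_cffi_pin` is an illustrative name):

```python
import platform
import sys

def pick_curl_cffi_pin():
    # Mirrors the sys_platform / platform_machine markers in the
    # build-curl-cffi extra above; illustrative only.
    machine = platform.machine()
    if sys.platform == 'darwin' or (sys.platform == 'linux' and machine != 'armv7l'):
        return 'curl-cffi==0.13.0'
    if sys.platform == 'win32' or (sys.platform == 'linux' and machine == 'armv7l'):
        return 'curl-cffi==0.14.0'
    return None  # no binary pin for this platform
```

Note the two branches are mutually exclusive, so at most one pinned version is ever installed.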
build = [
"build",
"hatchling>=1.27.0",


@@ -227,9 +227,13 @@ class TestDevalue(unittest.TestCase):
{'a': 'b'}, 'revivers (indirect)')
self.assertEqual(
-devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': lambda x: json.loads(x)}),
+devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': json.loads}),
{'a': 0}, 'revivers (parse)')
+self.assertEqual(
+devalue.parse([{'a': 1, 'b': 3}, ['EmptyRef', 2], 'false', ['EmptyRef', 2]], revivers={'EmptyRef': json.loads}),
+{'a': False, 'b': False}, msg='revivers (duplicate EmptyRef)')
if __name__ == '__main__':
unittest.main()
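The reviver mechanism exercised by these tests maps a named custom type in the serialized stream to a decoder callable. A toy illustration of the idea, not yt-dlp's `devalue` implementation (`parse_with_revivers` is a hypothetical helper that only handles the single `['Name', payload]` shape):

```python
import json

def parse_with_revivers(value, revivers):
    # A node of the form ['Name', payload] is decoded by revivers['Name'];
    # anything else passes through unchanged.
    if isinstance(value, list) and len(value) == 2 and value[0] in revivers:
        return revivers[value[0]](value[1])
    return value

print(parse_with_revivers(['parse', '{"a":0}'], {'parse': json.loads}))  # {'a': 0}
```

Passing `json.loads` directly, as the updated test does, is equivalent to the old `lambda x: json.loads(x)` wrapper.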


@@ -1,32 +1,4 @@
# flake8: noqa: F401
-# isort: off
-from .youtube import ( # Youtube is moved to the top to improve performance
-YoutubeIE,
-YoutubeClipIE,
-YoutubeFavouritesIE,
-YoutubeNotificationsIE,
-YoutubeHistoryIE,
-YoutubeTabIE,
-YoutubeLivestreamEmbedIE,
-YoutubePlaylistIE,
-YoutubeRecommendedIE,
-YoutubeSearchDateIE,
-YoutubeSearchIE,
-YoutubeSearchURLIE,
-YoutubeMusicSearchURLIE,
-YoutubeSubscriptionsIE,
-YoutubeTruncatedIDIE,
-YoutubeTruncatedURLIE,
-YoutubeYtBeIE,
-YoutubeYtUserIE,
-YoutubeWatchLaterIE,
-YoutubeShortsAudioPivotIE,
-YoutubeConsentRedirectIE,
-)
-# isort: on
from .abc import (
ABCIE,
ABCIViewIE,
@@ -2551,6 +2523,29 @@ from .youporn import (
YouPornTagIE,
YouPornVideosIE,
)
+from .youtube import (
+YoutubeClipIE,
+YoutubeConsentRedirectIE,
+YoutubeFavouritesIE,
+YoutubeHistoryIE,
+YoutubeIE,
+YoutubeLivestreamEmbedIE,
+YoutubeMusicSearchURLIE,
+YoutubeNotificationsIE,
+YoutubePlaylistIE,
+YoutubeRecommendedIE,
+YoutubeSearchDateIE,
+YoutubeSearchIE,
+YoutubeSearchURLIE,
+YoutubeShortsAudioPivotIE,
+YoutubeSubscriptionsIE,
+YoutubeTabIE,
+YoutubeTruncatedIDIE,
+YoutubeTruncatedURLIE,
+YoutubeWatchLaterIE,
+YoutubeYtBeIE,
+YoutubeYtUserIE,
+)
from .zaiko import (
ZaikoETicketIE,
ZaikoIE,


@@ -105,7 +105,7 @@ class CBCIE(InfoExtractor):
# multiple CBC.APP.Caffeine.initInstance(...)
'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
'info_dict': {
-'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks', # FIXME: actual title includes " | CBC News"
+'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238',
'description': 'md5:c18552e41726ee95bd75210d1ca9194c',
},
@@ -134,6 +134,13 @@ class CBCIE(InfoExtractor):
title = (self._og_search_title(webpage, default=None)
or self._html_search_meta('twitter:title', webpage, 'title', default=None)
or self._html_extract_title(webpage))
+title = self._search_regex(
+r'^(?P<title>.+?)(?:\s*[|-]\s*CBC.*)?$',
+title, 'cleaned title', group='title', default=title)
+data = self._search_json(
+r'window\.__INITIAL_STATE__\s*=', webpage,
+'initial state', display_id, default={}, transform_source=js_to_json)
entries = [
self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
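The title-cleanup regex added above strips a trailing `" | CBC News"`-style suffix (which is why the FIXME on the test title could be dropped). A standalone version for illustration (`clean_cbc_title` is a hypothetical helper, not extractor code):

```python
import re

def clean_cbc_title(title):
    # Same pattern as the extractor change: the lazy .+? keeps the shortest
    # prefix, and the optional group swallows a ' | CBC ...' or ' - CBC ...' tail.
    m = re.match(r'^(?P<title>.+?)(?:\s*[|-]\s*CBC.*)?$', title)
    return m.group('title') if m else title

print(clean_cbc_title('Dog tips | CBC News'))  # Dog tips
print(clean_cbc_title('Plain title'))          # Plain title
```

Because the suffix group is optional, titles without a network tag pass through unchanged.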
@@ -143,6 +150,11 @@ class CBCIE(InfoExtractor):
r'<div[^>]+\bid=["\']player-(\d+)',
r'guid["\']\s*:\s*["\'](\d+)'):
media_ids.extend(re.findall(media_id_re, webpage))
+media_ids.extend(traverse_obj(data, (
+'detail', 'content', 'body', ..., 'content',
+lambda _, v: v['type'] == 'polopoly_media', 'content', 'sourceId', {str})))
+if content_id := traverse_obj(data, ('app', 'contentId', {str})):
+media_ids.append(content_id)
entries.extend([
self.url_result(f'cbcplayer:{media_id}', 'CBCPlayer', media_id)
for media_id in orderedSet(media_ids)])
@@ -268,7 +280,7 @@ class CBCPlayerIE(InfoExtractor):
'duration': 2692.833,
'subtitles': {
'en-US': [{
-'name': 'English Captions',
+'name': r're:English',
'url': 'https://cbchls.akamaized.net/delivery/news-shows/2024/06/17/NAT_JUN16-00-55-00/NAT_JUN16_cc.vtt',
}],
},
@@ -322,6 +334,7 @@ class CBCPlayerIE(InfoExtractor):
'categories': ['Olympics Summer Soccer', 'Summer Olympics Replays', 'Summer Olympics Soccer Replays'],
'location': 'Canada',
},
+'skip': 'Video no longer available',
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.cbc.ca/player/play/video/9.6459530',
@@ -380,7 +393,8 @@ class CBCPlayerIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(f'https://www.cbc.ca/player/play/{video_id}', video_id)
data = self._search_json(
-r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)['video']['currentClip']
+r'window\.__INITIAL_STATE__\s*=', webpage,
+'initial state', video_id, transform_source=js_to_json)['video']['currentClip']
assets = traverse_obj(
data, ('media', 'assets', lambda _, v: url_or_none(v['key']) and v['type']))
@@ -492,12 +506,14 @@ class CBCPlayerPlaylistIE(InfoExtractor):
'info_dict': {
'id': 'news/tv shows/the national/latest broadcast',
},
+'skip': 'Playlist no longer available',
}, {
'url': 'https://www.cbc.ca/player/news/Canada/North',
'playlist_mincount': 25,
'info_dict': {
'id': 'news/canada/north',
},
+'skip': 'Playlist no longer available',
}]
def _real_extract(self, url):


@@ -18,23 +18,41 @@ class CCCIE(InfoExtractor):
'id': '1839',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
-'creator': 'byterazor',
+'creators': ['byterazor'],
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
'tags': list,
'display_id': '30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor',
'view_count': int,
},
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
'only_matching': True,
+}, {
+'url': 'https://media.ccc.de/v/39c3-schlechte-karten-it-sicherheit-im-jahr-null-der-epa-fur-alle',
+'info_dict': {
+'id': '16261',
+'ext': 'mp4',
+'title': 'Schlechte Karten - IT-Sicherheit im Jahr null der ePA für alle',
+'display_id': '39c3-schlechte-karten-it-sicherheit-im-jahr-null-der-epa-fur-alle',
+'description': 'md5:719a5a9a52630249d606219c55056cbf',
+'view_count': int,
+'duration': 3619,
+'thumbnail': 'https://static.media.ccc.de/media/congress/2025/2403-2b5a6a8e-327e-594d-8f92-b91201d18a02.jpg',
+'tags': list,
+'creators': ['Bianca Kastl'],
+'timestamp': 1767024900,
+'upload_date': '20251229',
+},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
-event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
+event_id = self._search_regex(r"data-id=(['\"])(?P<event_id>\d+)\1", webpage, 'event id', group='event_id')
event_data = self._download_json(f'https://media.ccc.de/public/events/{event_id}', event_id)
formats = []
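The relaxed `data-id` pattern above uses a capturing group plus a backreference (`\1`) so the closing quote must match the opening one, accepting either quote style. A standalone demonstration (`extract_event_id` is an illustrative helper, not part of the extractor):

```python
import re

# Triple-quoted raw string avoids escaping either quote character.
PATTERN = r"""data-id=(['"])(?P<event_id>\d+)\1"""

def extract_event_id(webpage):
    m = re.search(PATTERN, webpage)
    return m.group('event_id') if m else None

print(extract_event_id("<div data-id='1839'>"))   # 1839
print(extract_event_id('<div data-id="16261">'))  # 16261
```

The old pattern only matched single-quoted attributes, which is what the new test page exposed.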


@@ -1,4 +1,4 @@
-import inspect
+import itertools
import os
from ..globals import LAZY_EXTRACTORS
@@ -17,12 +17,18 @@ else:
if not _CLASS_LOOKUP:
from . import _extractors
-_CLASS_LOOKUP = {
-name: value
-for name, value in inspect.getmembers(_extractors)
-if name.endswith('IE') and name != 'GenericIE'
-}
-_CLASS_LOOKUP['GenericIE'] = _extractors.GenericIE
+members = tuple(
+(name, getattr(_extractors, name))
+for name in dir(_extractors)
+if name.endswith('IE')
+)
+_CLASS_LOOKUP = dict(itertools.chain(
+# Add Youtube first to improve matching performance
+((name, value) for name, value in members if '.youtube' in value.__module__),
+# Add Generic last so that it is the fallback
+((name, value) for name, value in members if name != 'GenericIE'),
+(('GenericIE', _extractors.GenericIE),),
+))
# We want to append to the main lookup
_current = _extractors_context.value
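The chained-dict trick above relies on insertion-order semantics: a duplicate key keeps its first-inserted position, so the YouTube entries emitted first stay at the front even though they reappear in the second generator, and `GenericIE`, chained last, stays last. A toy version with illustrative names:

```python
import itertools

# Stand-in for the real (name, class) members; values are just labels here.
members = [('YoutubeIE', 'youtube'), ('AbcIE', 'abc'), ('GenericIE', 'generic')]

lookup = dict(itertools.chain(
    ((n, v) for n, v in members if n == 'YoutubeIE'),   # priority entries first
    ((n, v) for n, v in members if n != 'GenericIE'),   # everything else (re-yields YoutubeIE)
    [('GenericIE', 'generic')],                          # fallback last
))

print(list(lookup))  # ['YoutubeIE', 'AbcIE', 'GenericIE']
```

This replaces the old `inspect.getmembers` approach, which sorted names alphabetically and needed the `# isort: off` import reordering in `_extractors` to keep YouTube first.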


@@ -99,7 +99,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'WEB',
-'clientVersion': '2.20250925.01.00',
+'clientVersion': '2.20260114.08.00',
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
@@ -112,7 +112,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'WEB',
-'clientVersion': '2.20250925.01.00',
+'clientVersion': '2.20260114.08.00',
'userAgent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15,gzip(gfe)',
},
},
@@ -125,7 +125,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'WEB_EMBEDDED_PLAYER',
-'clientVersion': '1.20250923.21.00',
+'clientVersion': '1.20260115.01.00',
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 56,
@@ -136,7 +136,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'WEB_REMIX',
-'clientVersion': '1.20250922.03.00',
+'clientVersion': '1.20260114.03.00',
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 67,
@@ -166,7 +166,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'WEB_CREATOR',
-'clientVersion': '1.20250922.03.00',
+'clientVersion': '1.20260114.05.00',
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 62,
@@ -195,9 +195,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID',
-'clientVersion': '20.10.38',
+'clientVersion': '21.02.35',
'androidSdkVersion': 30,
-'userAgent': 'com.google.android.youtube/20.10.38 (Linux; U; Android 11) gzip',
+'userAgent': 'com.google.android.youtube/21.02.35 (Linux; U; Android 11) gzip',
'osName': 'Android',
'osVersion': '11',
},
@@ -228,8 +228,8 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID',
-'clientVersion': '20.10.38',
-'userAgent': 'com.google.android.youtube/20.10.38 (Linux; U; Android 11) gzip',
+'clientVersion': '21.02.35',
+'userAgent': 'com.google.android.youtube/21.02.35 (Linux; U; Android 11) gzip',
'osName': 'Android',
'osVersion': '11',
},
@@ -242,11 +242,11 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID_VR',
-'clientVersion': '1.65.10',
+'clientVersion': '1.71.26',
'deviceMake': 'Oculus',
'deviceModel': 'Quest 3',
'androidSdkVersion': 32,
-'userAgent': 'com.google.android.apps.youtube.vr.oculus/1.65.10 (Linux; U; Android 12L; eureka-user Build/SQ3A.220605.009.A1) gzip',
+'userAgent': 'com.google.android.apps.youtube.vr.oculus/1.71.26 (Linux; U; Android 12L; eureka-user Build/SQ3A.220605.009.A1) gzip',
'osName': 'Android',
'osVersion': '12L',
},
@@ -260,10 +260,10 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'IOS',
-'clientVersion': '20.10.4',
+'clientVersion': '21.02.3',
'deviceMake': 'Apple',
'deviceModel': 'iPhone16,2',
-'userAgent': 'com.google.ios.youtube/20.10.4 (iPhone16,2; U; CPU iOS 18_3_2 like Mac OS X;)',
+'userAgent': 'com.google.ios.youtube/21.02.3 (iPhone16,2; U; CPU iOS 18_3_2 like Mac OS X;)',
'osName': 'iPhone',
'osVersion': '18.3.2.22D82',
},
@@ -291,7 +291,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'MWEB',
-'clientVersion': '2.20250925.01.00',
+'clientVersion': '2.20260115.01.00',
# mweb previously did not require PO Token with this UA
'userAgent': 'Mozilla/5.0 (iPad; CPU OS 16_7_10 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1,gzip(gfe)',
},
@@ -322,7 +322,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'TVHTML5',
-'clientVersion': '7.20250923.13.00',
+'clientVersion': '7.20260114.12.00',
'userAgent': 'Mozilla/5.0 (ChromiumStylePlatform) Cobalt/Version',
},
},
@@ -335,7 +335,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'TVHTML5',
-'clientVersion': '5.20251105',
+'clientVersion': '5.20260114',
'userAgent': 'Mozilla/5.0 (ChromiumStylePlatform) Cobalt/Version',
},
},


@@ -10,7 +10,6 @@ import re
import sys
import threading
import time
-import traceback
import urllib.parse
from ._base import (
@@ -63,6 +62,7 @@ from ...utils import (
unescapeHTML,
unified_strdate,
unsmuggle_url,
+update_url,
update_url_query,
url_or_none,
urljoin,
@@ -145,9 +145,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
r'\b(?P<id>vfl[a-zA-Z0-9_-]+)\b.*?\.js$',
)
_SUBTITLE_FORMATS = ('json3', 'srv1', 'srv2', 'srv3', 'ttml', 'srt', 'vtt')
-_DEFAULT_CLIENTS = ('tv', 'android_sdkless', 'web')
-_DEFAULT_JSLESS_CLIENTS = ('android_sdkless', 'web_safari', 'web')
-_DEFAULT_AUTHED_CLIENTS = ('tv_downgraded', 'web_safari', 'web')
+_DEFAULT_CLIENTS = ('android_sdkless', 'web', 'web_safari')
+_DEFAULT_JSLESS_CLIENTS = ('android_sdkless',)
+_DEFAULT_AUTHED_CLIENTS = ('tv_downgraded', 'web', 'web_safari')
# Premium does not require POT (except for subtitles)
_DEFAULT_PREMIUM_CLIENTS = ('tv_downgraded', 'web_creator', 'web')
@@ -2193,64 +2193,32 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self._code_cache[player_js_key] = code
return self._code_cache.get(player_js_key)
-def _sig_spec_cache_id(self, player_url, spec_id):
-return join_nonempty(self._player_js_cache_key(player_url), str(spec_id))
+def _load_player_data_from_cache(self, name, player_url, *cache_keys, use_disk_cache=False):
+cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url), *map(str_or_none, cache_keys))
+if cache_id in self._player_cache:
+return self._player_cache[cache_id]
-def _load_sig_spec_from_cache(self, spec_cache_id):
-# This is almost identical to _load_player_data_from_cache
-# I hate it
-if spec_cache_id in self._player_cache:
-return self._player_cache[spec_cache_id]
-spec = self.cache.load('youtube-sigfuncs', spec_cache_id, min_ver='2025.07.21')
-if spec:
-self._player_cache[spec_cache_id] = spec
-return spec
+if not use_disk_cache:
+return None
-def _store_sig_spec_to_cache(self, spec_cache_id, spec):
-if spec_cache_id not in self._player_cache:
-self._player_cache[spec_cache_id] = spec
-self.cache.store('youtube-sigfuncs', spec_cache_id, spec)
-def _load_player_data_from_cache(self, name, player_url):
-cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url))
-if data := self._player_cache.get(cache_id):
-return data
-data = self.cache.load(*cache_id, min_ver='2025.07.21')
+data = self.cache.load(cache_id[0], join_nonempty(*cache_id[1:]), min_ver='2025.07.21')
if data:
self._player_cache[cache_id] = data
return data
-def _cached(self, func, *cache_id):
-def inner(*args, **kwargs):
-if cache_id not in self._player_cache:
-try:
-self._player_cache[cache_id] = func(*args, **kwargs)
-except ExtractorError as e:
-self._player_cache[cache_id] = e
-except Exception as e:
-self._player_cache[cache_id] = ExtractorError(traceback.format_exc(), cause=e)
-ret = self._player_cache[cache_id]
-if isinstance(ret, Exception):
-raise ret
-return ret
-return inner
-def _store_player_data_to_cache(self, name, player_url, data):
-cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url))
+def _store_player_data_to_cache(self, data, name, player_url, *cache_keys, use_disk_cache=False):
+cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url), *map(str_or_none, cache_keys))
if cache_id not in self._player_cache:
-self.cache.store(*cache_id, data)
self._player_cache[cache_id] = data
+if use_disk_cache:
+self.cache.store(cache_id[0], join_nonempty(*cache_id[1:]), data)
def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
"""
Extract signatureTimestamp (sts)
Required to tell API what sig/player version is in use.
"""
-CACHE_ENABLED = False # TODO: enable when preprocessed player JS cache is solved/enabled
player_sts_override = self._get_player_js_version()[0]
if player_sts_override:
@@ -2267,15 +2235,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self.report_warning(error_msg)
return None
-if CACHE_ENABLED and (sts := self._load_player_data_from_cache('sts', player_url)):
+# TODO: Pass `use_disk_cache=True` when preprocessed player JS cache is solved
+if sts := self._load_player_data_from_cache('sts', player_url):
return sts
if code := self._load_player(video_id, player_url, fatal=fatal):
sts = int_or_none(self._search_regex(
r'(?:signatureTimestamp|sts)\s*:\s*(?P<sts>[0-9]{5})', code,
'JS player signature timestamp', group='sts', fatal=fatal))
-if CACHE_ENABLED and sts:
-self._store_player_data_to_cache('sts', player_url, sts)
+if sts:
+# TODO: Pass `use_disk_cache=True` when preprocessed player JS cache is solved
+self._store_player_data_to_cache(sts, 'sts', player_url)
return sts
@@ -2793,7 +2763,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'WEB_PLAYER_CONTEXT_CONFIGS', ..., 'serializedExperimentFlags', {urllib.parse.parse_qs}))
if 'true' in traverse_obj(experiments, (..., 'html5_generate_content_po_token', -1)):
self.write_debug(
-f'{video_id}: Detected experiment to bind GVS PO Token to video id.', only_once=True)
+f'{video_id}: Detected experiment to bind GVS PO Token '
+f'to video ID for {client} client', only_once=True)
gvs_bind_to_video_id = True
# GVS WebPO Token is bound to visitor_data / Visitor ID when logged out.
@@ -3233,6 +3204,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'audio_quality_ultralow', 'audio_quality_low', 'audio_quality_medium', 'audio_quality_high', # Audio only formats
'small', 'medium', 'large', 'hd720', 'hd1080', 'hd1440', 'hd2160', 'hd2880', 'highres',
])
+skip_player_js = 'js' in self._configuration_arg('player_skip')
format_types = self._configuration_arg('formats')
all_formats = 'duplicate' in format_types
if self._configuration_arg('include_duplicate_formats'):
@@ -3278,6 +3250,98 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return language_code, DEFAULT_LANG_VALUE
return language_code, -1
def get_manifest_n_challenge(manifest_url):
if not url_or_none(manifest_url):
return None
# Same pattern that the player JS uses to read/replace the n challenge value
return self._search_regex(
r'/n/([^/]+)/', urllib.parse.urlparse(manifest_url).path,
'n challenge', default=None)
n_challenges = set()
s_challenges = set()
def solve_js_challenges():
# Solve all n/sig challenges in bulk and store the results in self._player_cache
challenge_requests = []
if n_challenges:
challenge_requests.append(JsChallengeRequest(
type=JsChallengeType.N,
video_id=video_id,
input=NChallengeInput(challenges=list(n_challenges), player_url=player_url)))
if s_challenges:
cached_sigfuncs = set()
for spec_id in s_challenges:
if self._load_player_data_from_cache('sigfuncs', player_url, spec_id, use_disk_cache=True):
cached_sigfuncs.add(spec_id)
s_challenges.difference_update(cached_sigfuncs)
challenge_requests.append(JsChallengeRequest(
type=JsChallengeType.SIG,
video_id=video_id,
input=SigChallengeInput(
challenges=[''.join(map(chr, range(spec_id))) for spec_id in s_challenges],
player_url=player_url)))
if challenge_requests:
for _challenge_request, challenge_response in self._jsc_director.bulk_solve(challenge_requests):
if challenge_response.type == JsChallengeType.SIG:
for challenge, result in challenge_response.output.results.items():
spec_id = len(challenge)
self._store_player_data_to_cache(
[ord(c) for c in result], 'sigfuncs',
player_url, spec_id, use_disk_cache=True)
if spec_id in s_challenges:
s_challenges.remove(spec_id)
elif challenge_response.type == JsChallengeType.N:
for challenge, result in challenge_response.output.results.items():
self._store_player_data_to_cache(result, 'n', player_url, challenge)
if challenge in n_challenges:
n_challenges.remove(challenge)
# Raise warning if any challenge requests remain
# Depending on type of challenge request
help_message = (
'Ensure you have a supported JavaScript runtime and '
'challenge solver script distribution installed. '
'Review any warnings presented before this message. '
f'For more details, refer to {_EJS_WIKI_URL}')
if s_challenges:
self.report_warning(
f'Signature solving failed: Some formats may be missing. {help_message}',
video_id=video_id, only_once=True)
if n_challenges:
self.report_warning(
f'n challenge solving failed: Some formats may be missing. {help_message}',
video_id=video_id, only_once=True)
# Clear challenge sets so that any subsequent call of this function is a no-op
s_challenges.clear()
n_challenges.clear()
# 1st pass to collect all n/sig challenges so they can later be solved at once in bulk
for streaming_data in traverse_obj(player_responses, (..., 'streamingData', {dict})):
# HTTPS formats
for fmt_stream in traverse_obj(streaming_data, (('formats', 'adaptiveFormats'), ..., {dict})):
fmt_url = fmt_stream.get('url')
s_challenge = None
if not fmt_url:
sc = urllib.parse.parse_qs(fmt_stream.get('signatureCipher'))
fmt_url = traverse_obj(sc, ('url', 0, {url_or_none}))
s_challenge = traverse_obj(sc, ('s', 0))
if s_challenge:
s_challenges.add(len(s_challenge))
if n_challenge := traverse_obj(fmt_url, ({parse_qs}, 'n', 0)):
n_challenges.add(n_challenge)
# Manifest formats
n_challenges.update(traverse_obj(
streaming_data, (('hlsManifestUrl', 'dashManifestUrl'), {get_manifest_n_challenge})))
# Final pass to extract formats and solve n/sig challenges as needed
for pr in player_responses:
streaming_data = traverse_obj(pr, 'streamingData')
if not streaming_data:
@@ -3385,7 +3449,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def process_https_formats():
proto = 'https'
https_fmts = []
-skip_player_js = 'js' in self._configuration_arg('player_skip')
for fmt_stream in streaming_formats:
if fmt_stream.get('targetDurationSec'):
@@ -3422,19 +3485,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# See: https://github.com/yt-dlp/yt-dlp/issues/14883
get_language_code_and_preference(fmt_stream)
sc = urllib.parse.parse_qs(fmt_stream.get('signatureCipher'))
-fmt_url = url_or_none(try_get(sc, lambda x: x['url'][0]))
-encrypted_sig = try_get(sc, lambda x: x['s'][0])
+fmt_url = traverse_obj(sc, ('url', 0, {url_or_none}))
+encrypted_sig = traverse_obj(sc, ('s', 0))
if not all((sc, fmt_url, skip_player_js or player_url, encrypted_sig)):
-msg = f'Some {client_name} client https formats have been skipped as they are missing a URL. '
+msg_tmpl = (
+'{}Some {} client https formats have been skipped as they are missing a URL. '
+'{}. See https://github.com/yt-dlp/yt-dlp/issues/12482 for more details')
if client_name in ('web', 'web_safari'):
-msg += 'YouTube is forcing SABR streaming for this client. '
+self.write_debug(msg_tmpl.format(
+f'{video_id}: ', client_name,
+'YouTube is forcing SABR streaming for this client'), only_once=True)
else:
-msg += (
+msg = (
f'YouTube may have enabled the SABR-only streaming experiment for '
-f'{"your account" if self.is_authenticated else "the current session"}. '
-)
-msg += 'See https://github.com/yt-dlp/yt-dlp/issues/12482 for more details'
-self.report_warning(msg, video_id, only_once=True)
+f'{"your account" if self.is_authenticated else "the current session"}')
+self.report_warning(msg_tmpl.format('', client_name, msg), video_id, only_once=True)
continue
fmt = process_format_stream(
@@ -3444,19 +3509,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
continue
-# signature
+# Attempt to load sig spec from cache
if encrypted_sig:
if skip_player_js:
continue
-spec_cache_id = self._sig_spec_cache_id(player_url, len(encrypted_sig))
-spec = self._load_sig_spec_from_cache(spec_cache_id)
-if spec:
-self.write_debug(f'Using cached signature function {spec_cache_id}', only_once=True)
-fmt_url += '&{}={}'.format(traverse_obj(sc, ('sp', -1)) or 'signature',
-solve_sig(encrypted_sig, spec))
-else:
-fmt['_jsc_s_challenge'] = encrypted_sig
-fmt['_jsc_s_sc'] = sc
+solve_js_challenges()
+spec = self._load_player_data_from_cache(
+'sigfuncs', player_url, len(encrypted_sig), use_disk_cache=True)
+if not spec:
+continue
+fmt_url += '&{}={}'.format(
+traverse_obj(sc, ('sp', -1)) or 'signature',
+solve_sig(encrypted_sig, spec))
# n challenge
query = parse_qs(fmt_url)
@@ -3464,10 +3527,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if skip_player_js:
continue
n_challenge = query['n'][0]
-if n_challenge in self._player_cache:
-fmt_url = update_url_query(fmt_url, {'n': self._player_cache[n_challenge]})
-else:
-fmt['_jsc_n_challenge'] = n_challenge
+solve_js_challenges()
+n_result = self._load_player_data_from_cache('n', player_url, n_challenge)
+if not n_result:
+continue
+fmt_url = update_url_query(fmt_url, {'n': n_result})
if po_token:
fmt_url = update_url_query(fmt_url, {'pot': po_token})
@@ -3484,80 +3548,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
https_fmts.append(fmt)
-# Bulk process sig/n handling
-# Retrieve all JSC Sig and n requests for this player response in one go
-n_challenges = {}
-s_challenges = {}
-for fmt in https_fmts:
-# This will de-duplicate requests
-n_challenge = fmt.pop('_jsc_n_challenge', None)
-if n_challenge is not None:
-n_challenges.setdefault(n_challenge, []).append(fmt)
-s_challenge = fmt.pop('_jsc_s_challenge', None)
-if s_challenge is not None:
-s_challenges.setdefault(len(s_challenge), {}).setdefault(s_challenge, []).append(fmt)
-challenge_requests = []
-if n_challenges:
-challenge_requests.append(JsChallengeRequest(
-type=JsChallengeType.N,
-video_id=video_id,
-input=NChallengeInput(challenges=list(n_challenges.keys()), player_url=player_url)))
-if s_challenges:
-challenge_requests.append(JsChallengeRequest(
-type=JsChallengeType.SIG,
-video_id=video_id,
-input=SigChallengeInput(challenges=[''.join(map(chr, range(spec_id))) for spec_id in s_challenges], player_url=player_url)))
-if challenge_requests:
-for _challenge_request, challenge_response in self._jsc_director.bulk_solve(challenge_requests):
-if challenge_response.type == JsChallengeType.SIG:
-for challenge, result in challenge_response.output.results.items():
-spec_id = len(challenge)
-spec = [ord(c) for c in result]
-self._store_sig_spec_to_cache(self._sig_spec_cache_id(player_url, spec_id), spec)
-s_challenge_data = s_challenges.pop(spec_id, {})
-if not s_challenge_data:
-continue
-for s_challenge, fmts in s_challenge_data.items():
-solved_challenge = solve_sig(s_challenge, spec)
-for fmt in fmts:
-sc = fmt.pop('_jsc_s_sc')
-fmt['url'] += '&{}={}'.format(
-traverse_obj(sc, ('sp', -1)) or 'signature',
-solved_challenge)
-elif challenge_response.type == JsChallengeType.N:
-for challenge, result in challenge_response.output.results.items():
-fmts = n_challenges.pop(challenge, [])
-for fmt in fmts:
-self._player_cache[challenge] = result
-fmt['url'] = update_url_query(fmt['url'], {'n': result})
-# Raise warning if any challenge requests remain
-# Depending on type of challenge request
-help_message = (
-'Ensure you have a supported JavaScript runtime and '
-'challenge solver script distribution installed. '
-'Review any warnings presented before this message. '
-f'For more details, refer to {_EJS_WIKI_URL}')
-if s_challenges:
-self.report_warning(
-f'Signature solving failed: Some formats may be missing. {help_message}',
-video_id=video_id, only_once=True)
-if n_challenges:
-self.report_warning(
-f'n challenge solving failed: Some formats may be missing. {help_message}',
-video_id=video_id, only_once=True)
-for cfmts in list(s_challenges.values()) + list(n_challenges.values()):
-for fmt in cfmts:
-if fmt in https_fmts:
-https_fmts.remove(fmt)
for fmt in https_fmts:
if (all_formats or 'dashy' in format_types) and fmt['filesize']:
yield {
@@ -3640,17 +3630,34 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
hls_manifest_url = 'hls' not in skip_manifests and streaming_data.get('hlsManifestUrl')
if hls_manifest_url:
+manifest_path = urllib.parse.urlparse(hls_manifest_url).path
+if m := re.fullmatch(r'(?P<path>.+)(?P<suffix>/(?:file|playlist)/index\.m3u8)', manifest_path):
+manifest_path, manifest_suffix = m.group('path', 'suffix')
+else:
+manifest_suffix = ''
+solved_n = False
+n_challenge = get_manifest_n_challenge(hls_manifest_url)
+if n_challenge and not skip_player_js:
+solve_js_challenges()
+n_result = self._load_player_data_from_cache('n', player_url, n_challenge)
+if n_result:
+manifest_path = manifest_path.replace(f'/n/{n_challenge}', f'/n/{n_result}')
+solved_n = n_result in manifest_path
pot_policy: GvsPoTokenPolicy = self._get_default_ytcfg(
client_name)['GVS_PO_TOKEN_POLICY'][StreamingProtocol.HLS]
require_po_token = gvs_pot_required(pot_policy, is_premium_subscriber, player_token_provided)
po_token = gvs_pots.get(client_name, fetch_po_token_func(required=require_po_token or pot_policy.recommended))
if po_token:
-hls_manifest_url = hls_manifest_url.rstrip('/') + f'/pot/{po_token}'
+manifest_path = manifest_path.rstrip('/') + f'/pot/{po_token}'
if client_name not in gvs_pots:
gvs_pots[client_name] = po_token
if require_po_token and not po_token and 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, 'hls')
-else:
+elif solved_n or not n_challenge:
+hls_manifest_url = update_url(hls_manifest_url, path=f'{manifest_path}{manifest_suffix}')
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_manifest_url, video_id, 'mp4', fatal=False, live=live_status == 'is_live')
for sub in traverse_obj(subs, (..., ..., {dict})):
@@ -3665,17 +3672,30 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
dash_manifest_url = 'dash' not in skip_manifests and streaming_data.get('dashManifestUrl')
if dash_manifest_url:
+manifest_path = urllib.parse.urlparse(dash_manifest_url).path
+solved_n = False
+n_challenge = get_manifest_n_challenge(dash_manifest_url)
+if n_challenge and not skip_player_js:
+solve_js_challenges()
+n_result = self._load_player_data_from_cache('n', player_url, n_challenge)
+if n_result:
+manifest_path = manifest_path.replace(f'/n/{n_challenge}', f'/n/{n_result}')
+solved_n = n_result in manifest_path
pot_policy: GvsPoTokenPolicy = self._get_default_ytcfg(
client_name)['GVS_PO_TOKEN_POLICY'][StreamingProtocol.DASH]
require_po_token = gvs_pot_required(pot_policy, is_premium_subscriber, player_token_provided)
po_token = gvs_pots.get(client_name, fetch_po_token_func(required=require_po_token or pot_policy.recommended))
if po_token:
-dash_manifest_url = dash_manifest_url.rstrip('/') + f'/pot/{po_token}'
+manifest_path = manifest_path.rstrip('/') + f'/pot/{po_token}'
if client_name not in gvs_pots:
gvs_pots[client_name] = po_token
if require_po_token and not po_token and 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, 'dash')
-else:
+elif solved_n or not n_challenge:
+dash_manifest_url = update_url(dash_manifest_url, path=manifest_path)
formats, subs = self._extract_mpd_formats_and_subtitles(dash_manifest_url, video_id, fatal=False)
for sub in traverse_obj(subs, (..., ..., {dict})):
# TODO: If DASH video requires a PO Token, do the subs also require pot?
@@ -33,9 +33,9 @@ if curl_cffi is None:
curl_cffi_version = tuple(map(int, re.split(r'[^\d]+', curl_cffi.__version__)[:3]))
-if curl_cffi_version != (0, 5, 10) and not (0, 10) <= curl_cffi_version < (0, 14):
+if curl_cffi_version != (0, 5, 10) and not (0, 10) <= curl_cffi_version < (0, 15):
curl_cffi._yt_dlp__version = f'{curl_cffi.__version__} (unsupported)'
-raise ImportError('Only curl_cffi versions 0.5.10, 0.10.x, 0.11.x, 0.12.x, 0.13.x are supported')
+raise ImportError('Only curl_cffi versions 0.5.10 and 0.10.x through 0.14.x are supported')
import curl_cffi.requests
from curl_cffi.const import CurlECode, CurlOpt
@@ -578,7 +578,8 @@ def create_parser():
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
'2023': ['2024', 'prefer-vp9-sort'],
-'2024': ['mtime-by-default'],
+'2024': ['2025', 'mtime-by-default'],
+'2025': [],
},
}, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
@@ -886,6 +887,10 @@ def create_parser():
dest='format_sort', default=[], type='str', action='callback',
callback=_list_from_options_callback, callback_kwargs={'append': -1},
help='Sort the formats by the fields given, see "Sorting Formats" for more details')
+video_format.add_option(
+'--format-sort-reset',
+dest='format_sort', action='store_const', const=[],
+help='Disregard previous user specified sort order and reset to the default')
video_format.add_option(
'--format-sort-force', '--S-force',
action='store_true', dest='format_sort_force', metavar='FORMAT', default=False,
@@ -5,6 +5,7 @@ import dataclasses
import functools
import os.path
import sys
+import sysconfig
from ._utils import _get_exe_version_output, detect_exe_version, version_tuple
@@ -13,6 +14,13 @@ _FALLBACK_PATHEXT = ('.COM', '.EXE', '.BAT', '.CMD')
def _find_exe(basename: str) -> str:
+# Check in Python "scripts" path, e.g. for pipx-installed binaries
+binary = os.path.join(
+sysconfig.get_path('scripts'),
+basename + sysconfig.get_config_var('EXE'))
+if os.access(binary, os.F_OK | os.X_OK) and not os.path.isdir(binary):
+return binary
if os.name != 'nt':
return basename
@@ -62,10 +62,10 @@ def parse_iter(parsed: typing.Any, /, *, revivers: dict[str, collections.abc.Cal
if isinstance(source, tuple):
name, source, reviver = source
try:
-resolved[source] = target[index] = reviver(target[index])
+target[index] = reviver(target[index])
except Exception as error:
yield TypeError(f'failed to parse {source} as {name!r}: {error}')
-resolved[source] = target[index] = None
+target[index] = None
continue
if source in resolved: