Commit graph

553 commits

Author SHA1 Message Date
Yen Chi Hsuan 81c13222c6
[utils] Recognize more formats in unified_timestamp
Used in CtsNews
2016-08-10 11:37:23 +08:00
Sergey M․ a8795327ca
[utils] Add support TV Parental Guidelines ratings in parse_age_limit 2016-08-07 20:45:18 +07:00
Yen Chi Hsuan d3f8e038fe
[utils] Add decode_png for openload (#9706) 2016-08-07 02:42:58 +08:00
Yen Chi Hsuan 7dc2a74e0a
[utils] Fix unified_timestamp for formats parsed by parsedate_tz() 2016-08-05 11:41:55 +08:00
Sergey M․ f164b97123
[utils] Add another f4m mimetype to mimetype2ext 2016-07-23 16:48:59 +07:00
Remita Amine e910fe2fe4 [brightcove] skip ism manifests 2016-07-14 14:13:57 +01:00
Yen Chi Hsuan 0b68de3cc1 Merge pull request #8876 from remitamine/html5_media
[extractor/common] add helper method to extract html5 media entries
2016-07-10 23:40:45 +08:00
Yen Chi Hsuan 84c237fb8a
[utils] Add get_element_by_class
For #9950
2016-07-06 20:02:52 +08:00
Remita Amine b4173f1551 [utils] add mimetypes to determine manifest ext(m3u8, f4m, mpd) 2016-07-06 09:06:28 +01:00
Remita Amine 81953d1ae5 [kaltura] add support videos stored on custom kaltura servers(closes #5557) 2016-07-04 17:59:58 +01:00
Sergey M․ 95cf60e826
[utils] Add PUTRequest 2016-07-03 02:21:32 +07:00
Aleksandar Topuzovic 6b03e1e25d
[HRTi] Implement extractor for Croatian Radiotelevision 2016-07-03 02:20:41 +07:00
remitamine 4f3c5e0627 [utils] add helper function for parsing codecs 2016-06-26 14:03:58 +01:00
Yen Chi Hsuan 1143535d76
[utils] Add urshift()
Used in IqiyiIE and LeIE
2016-06-26 15:16:49 +08:00
Sergey M․ b72b44318c
[utils] Add strip_or_none 2016-06-25 23:19:18 +07:00
Sergey M․ 46f59e89ea
[utils] Add unified_timestamp 2016-06-25 23:19:18 +07:00
remitamine e154c65128 [downloader/hls] Add support for AES-128 encrypted segments in hlsnative downloader 2016-06-19 01:01:40 +01:00
Yen Chi Hsuan 47212f7bcb
[utils] Don't transform numbers not starting with a zero
Fix test_Viidea and maybe others
2016-06-16 11:00:54 +08:00
Sergey M․ 329ca3bef6
[utils] Add try_get
To reduce boilerplate when accessing JSON
2016-06-12 06:05:34 +07:00
Paul Henning 15d106787e [utils] Change Firefox 44 to 47
See commit title.
2016-06-11 05:36:31 -04:00
Yen Chi Hsuan 55b2f099c0
[utils] Decode HTML5 entities
Used in test_Vporn_1. Also related to #9270
2016-06-10 15:11:55 +08:00
Yen Chi Hsuan 6c33d24b46
[utils] Add audio/mpeg to mimetype2ext()
Used in WDR live radios (#6147)
2016-06-09 12:58:24 +08:00
bzc6p c88270271e Added sanitization support for Hungarian letters Ő and Ű 2016-06-02 11:51:48 +02:00
Yen Chi Hsuan 9a4aec8b7e [utils] Use bytes-like objects as header values on Python 2 2016-06-02 15:00:49 +08:00
Yen Chi Hsuan 0ea590076f [utils] Always decode Location header
escape_url is broken for bytes-like objects
2016-06-02 15:00:49 +08:00
Yen Chi Hsuan 293c255688
[utils] Remove debugging codes 2016-05-26 22:54:16 +08:00
Yen Chi Hsuan 5950cb1d6d
[utils] Support a new form of date
Found in dw.com (#9475)
2016-05-26 22:44:00 +08:00
Sergey M․ c6b9cf05e1
[utils] Do not fail on unknown date formats in unified_strdate 2016-05-22 08:28:41 +06:00
Sergey M․ 46bc9b7d7c
[utils] Allow None in remove_{start,end} 2016-05-19 04:31:30 +06:00
Yen Chi Hsuan cdd94c2eae
[utils] Check for None values in SOCKS proxy
Originally reported at
https://github.com/rg3/youtube-dl/pull/9287#issuecomment-219617864
2016-05-17 14:38:15 +08:00
Yen Chi Hsuan 79298173c5
[utils] Fix getheader in urlhandle_detect_ext
Fixes #7049, related to #9440
2016-05-15 15:34:50 +08:00
Sergey M․ cda6d47aad
[utils] Simplify integer conversion in js_to_json 2016-05-14 23:41:57 +06:00
Sergey M․ 89ac4a19e6
[utils] Process non-base 10 integers in js_to_json 2016-05-14 20:39:58 +06:00
felix bd1e484448
[utils] js_to_json: various improvements
now JS object literals like { /* " */ 0: ",]\xaa<\/p>", } will be correctly converted to JSON.
2016-05-14 20:12:39 +06:00
Yen Chi Hsuan 7581bfc958
[utils] Unquote crendentials passed to SOCKS proxies
Fixes #9450
2016-05-13 00:27:25 +08:00
Yen Chi Hsuan 778a1ccca7
[utils] Add Œ and œ found in French to ACCENT_CHARS
Fixes #9463
2016-05-12 19:48:48 +08:00
Yen Chi Hsuan 702ccf2dc0
[compat] Rename shlex_quote and remove unused subprocess_check_output 2016-05-10 16:00:21 +08:00
Yen Chi Hsuan edaa23f822
[compat] Rename struct_(un)pack to compat_struct_(un)pack 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan d5ae6bb501
[utils] Add rationale for register_socks_protocols 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan 51fb4995a5
[utils] Register SOCKS protocols in urllib and support SOCKS4A 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan 71aff18809
[socks] Support SOCKS proxies 2016-05-10 14:51:38 +08:00
Yen Chi Hsuan dab0daeeb0
[utils,compat] Move struct_pack and struct_unpack to compat.py 2016-05-10 14:51:38 +08:00
Sergey M․ abc97b5eda
[utils] Allow empty attribute values in get_element_by_attribute (Closes #9415) 2016-05-06 22:07:30 +06:00
Adam Thalhammer c587cbb793 improved performance by extracting accented chars to top level 2016-05-03 10:40:30 +10:00
Adam Thalhammer 79a2e94e79 Instead of replacing accented characters with an underscore when sanitizing file names in restricted mode, replace them with their non-accented equivalents fixes #9347 2016-05-02 13:21:39 +10:00
Sergey M․ eb9ee19422
[utils] Allow None mimetypes in mimetype2ext 2016-04-25 00:03:12 +06:00
Sergey M b6c0d4f431 Merge pull request #9110 from remitamine/parse_duration
[utils] imporove parse_duration to handle more formats
2016-04-21 22:53:16 +07:00
remitamine acaff49575 [utils] imporove parse_duration to handle more formats 2016-04-21 16:34:54 +01:00
Yen Chi Hsuan cacd996662 [utils] Don't touch URLs if not necessary
Fix test_Generic_15 (Google redirect)
2016-04-09 19:27:54 +08:00
Jaime Marquínez Ferrándiz 5bf28d7864 [utils] dfxp2srt: add additional namespace
Used by the ZDF subtitles (#9081).
2016-04-04 20:46:35 +02:00
Sergey M․ 15d260ebaa [utils] Use update_Request in http_request 2016-03-31 22:55:49 +06:00
Sergey M․ ed0291d153 [utils] Add update_Request 2016-03-31 22:55:01 +06:00
Sergey M․ 17bcc626bf [utils] Extract sanitize_url routine 2016-03-26 19:33:57 +06:00
Sergey M․ 15707c7e02 [compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode

Closes #8974
2016-03-26 01:46:57 +06:00
Yen Chi Hsuan 622d19160b [utils] Clarify Python versions affected by buggy struct module 2016-03-24 18:06:15 +08:00
Yen Chi Hsuan efbed08dc2 [utils] Encode hostnames before passing to urllib
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError

Fixes #8890
2016-03-23 22:24:52 +08:00
Jaime Marquínez Ferrándiz 782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string 2016-03-19 11:44:49 +01:00
Jaime Marquínez Ferrándiz 09fc33198a utils: lookup_unit_table: Use a stricter regex
In parse_count multiple units start with the same letter, so it would match different units depending on the order they were sorted when iterating over them.
2016-03-18 19:23:06 +01:00
Sergey M․ 810c10baa1 [utils] Use compat_xpath 2016-03-18 02:52:23 +06:00
Sergey M․ c5229f3926 [utils] PEP 8 2016-03-16 21:50:04 +06:00
remitamine 83548824c2 Merge pull request #8092 from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
Sergey M․ 2f7ae819ac [utils] PEP 8 2016-03-13 17:23:08 +06:00
Sergey M․ fb47597b09 [bbc] Generalize unit table lookup and add parse_count 2016-03-13 16:27:20 +06:00
Yen Chi Hsuan 25cb05bda9 [utils] Remove codec2ext
This function is orignally used for determining file extensions of DASH
formats. Now in DASH, ext is determined by mime_type. See #8766 for more
information.
2016-03-11 23:51:42 +08:00
Yen Chi Hsuan 6d210f2090 [utils] Add more codecs to codec2ext
BBC uses avc3. Here's an example (thanks to @remitamine for this example)

http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd

See also https://trac.ffmpeg.org/ticket/5217
2016-03-06 17:57:48 +08:00
Yen Chi Hsuan 19a17d4623 [utils] Add codec2ext 2016-03-05 18:18:28 +08:00
Jaime Marquínez Ferrándiz 3233a68fbb [utils] update_url_query: Encode the strings in the query dict
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
2016-03-04 22:18:40 +01:00
remitamine 1255733945 Merge pull request #8739 from remitamine/update_url_params
[utils] add update_url_query function to create or update query string params
2016-03-03 19:24:04 +01:00
remitamine 38f9ef31dc [utils] add update_url_query function 2016-03-03 18:34:52 +01:00
Yen Chi Hsuan 0cae023b24 Merge branch 'jython-support'
Closes #8302
2016-03-03 18:49:32 +08:00
Yen Chi Hsuan 8ee239e921 [utils] Jython support - handle filenames correctly
Now test:youtube downloads
2016-03-03 18:47:54 +08:00
Brian Foley 8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
remitamine e07237f640 [utils] remove check for val from find_xpath_attr 2016-03-02 21:40:21 +01:00
Yen Chi Hsuan 5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
Yen Chi Hsuan 680079be39 [utils] Relaxing regex in decode_packed_codes for vidzi 2016-02-26 15:13:03 +08:00
Yen Chi Hsuan f52354a889 [utils] Move codes for handling eval() from iqiyi.py 2016-02-26 14:58:29 +08:00
Yen Chi Hsuan 59f898b7a7 [utils] Merge base_n functions 2016-02-26 14:37:20 +08:00
Yen Chi Hsuan 481888294d [utils] Add base36 for use in Vidzi 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan 81bdc8fdf6 [utils] Move base62 to utils 2016-02-26 14:26:26 +08:00
Sergey M․ f160785c5c [utils] Remove AM/PM from unified_strdate patterns 2016-02-25 00:52:49 +06:00
Yen Chi Hsuan b95dc034ca [utils] Implement cache for OnDemandPagedList 2016-02-23 13:11:20 +08:00
remitamine cafcf657a4 add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction 2016-02-20 22:02:03 +01:00
Yen Chi Hsuan c1c05c67ea [utils] Jython support - disable setproctitle() until ctypes is complete 2016-02-21 03:32:03 +08:00
Yen Chi Hsuan 399a76e67b [utils] Jython support: tolerate missing fcntl module 2016-02-21 03:32:03 +08:00
Jaime Marquínez Ferrándiz 765ac263db [utils] mimetype2ext: return 'm4a' for 'audio/mp4' (fixes #8620)
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
2016-02-20 19:55:10 +01:00
Yen Chi Hsuan 5bc880b988 [utils] Add OHDave's RSA encryption function 2016-02-20 19:54:58 +08:00
Sergey M․ 611c1dd96e [refactor] Single quotes consistency 2016-02-14 15:37:17 +06:00
Sergey M․ d800609c62 [refactor] Do not specify redundant None as second argument in dict.get() 2016-02-14 14:25:04 +06:00
Sergey M․ 9c7b38981c [utils] Bump Firefox version in User-Agent
Old version number causes Youtube not to serve some formats in ytplayer.config
2016-02-11 23:12:30 +06:00
Sergey M․ 8411229bd5 [utils] Allow dot in strip_jsonp 2016-02-07 19:47:09 +06:00
Sergey M․ 86296ad2cd [utils] Add ability to control skipping false values in dict_get 2016-02-07 08:13:04 +06:00
Sergey M․ cbecc9b903 [utils] Add dict_get convenience method 2016-02-07 06:12:53 +06:00
Jaime Marquínez Ferrándiz 87de7069b9 [utils] dfxp2srt: make TTMLPElementParser inherit from object
For consistency between python 2 and 3.
2016-02-02 22:30:13 +01:00
remitamine 2b14cb566f [utils] fix dfxp2srt text extraction(fixes #8055) 2016-01-28 12:38:34 +01:00
Yen Chi Hsuan a0d8d704df [utils] Reorder items in mimetype2ext alphabetically 2016-01-25 01:01:15 +08:00
Yen Chi Hsuan f6861ec96f [utils] Add more items to mimetype2ext (#8293)
These are used in Youtube formats
2016-01-25 00:58:53 +08:00
remitamine 6ec6cb4e95 Revert "fix typos"
This reverts commit 36a0e46c39.
2016-01-10 19:27:22 +01:00
remitamine 36a0e46c39 fix typos 2016-01-10 17:55:41 +01:00
Jakub Wilk dfb1b1468c Fix typos
Closes #8200.
2016-01-10 17:24:28 +01:00
Sergey M․ a7aaa39863 [utils] Extract known extensions for reuse 2016-01-04 01:08:34 +06:00