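Comparing the two results, the only difference is that in the exact-match search, <b title="Como La Flor">Como La Flor</b> contains one extra element, <span class="s-fc7"></span>. So my approach here is to take the first item f-cb h-flag element inside srchsongst, find its sn element, and select the a inside it; that link carries the song's id.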
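The selection path described above (srchsongst, then the first item, then sn, then a) can be sketched with the standard library alone. This is only an illustration under assumptions: the markup in SAMPLE_HTML is a made-up approximation of the rendered search page, the ids 12345 and 67890 are placeholders, and the names FirstSongIdParser and extract_first_song_id are hypothetical. It also assumes the rendered HTML has already been obtained, for example from the browser's page source.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


class FirstSongIdParser(HTMLParser):
    """Remember the id of the first song link found inside the search result list."""

    REQUIRED = {"srchsongst", "item", "sn"}

    def __init__(self):
        super().__init__()
        self.div_stack = []   # class lists of the <div> elements currently open
        self.song_id = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            self.div_stack.append((attrs.get("class") or "").split())
        elif tag == "a" and self.song_id is None and self._in_result_name():
            # href is assumed to look like /song?id=<number>
            query = parse_qs(urlparse(attrs.get("href") or "").query)
            if "id" in query:
                self.song_id = query["id"][0]

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()

    def _in_result_name(self):
        # True when some open ancestor <div> carries each required class.
        open_classes = {c for classes in self.div_stack for c in classes}
        return self.REQUIRED <= open_classes


def extract_first_song_id(html):
    parser = FirstSongIdParser()
    parser.feed(html)
    return parser.song_id


# Made-up markup that mimics the structure described in the text.
SAMPLE_HTML = """
<div class="srchsongst">
  <div class="item f-cb h-flag">
    <div class="td w0"><div class="sn"><div class="text">
      <a href="/song?id=12345"><b title="Como La Flor">Como La Flor</b></a>
    </div></div></div>
  </div>
  <div class="item f-cb h-flag even">
    <div class="td w0"><div class="sn"><div class="text">
      <a href="/song?id=67890"><b title="Como La Flor (Live)">Como La Flor (Live)</b></a>
    </div></div></div>
  </div>
</div>
"""

print(extract_first_song_id(SAMPLE_HTML))
```

Because the parser stops recording after the first match, it returns the id of the first item only, which mirrors the "take the first result" choice made above.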
{"sgc":false,"sfy":false,"qfy":false,"transUser":{"id":12477352,"status":99,"demand":1,"userid":73446966,"nickname":"一片丹心向日葵","uptime":1604988576230},"lyricUser":{"id":12477349,"status":99,"demand":0,"userid":73446966,"nickname":"一片丹心向日葵","uptime":1604988562848},"lrc":{"version":9,"lyric":"[00:00.00] 作词 : A.B. Quintanilla III/Pete Astudillo\n[00:01.00] 作曲 : A.B. Quintanilla III/Pete Astudillo\n[00:17.23]Yo sé que tienes un nuevo amor\n[00:22.35]Sin embargo, te deseo lo mejor\n[00:27.55]Si en mi no encontraste felicidad\n[00:32.88]Tal vez alguien más te la dará\n[00:37.90]Como la flor (Como la flor)\n[00:40.35]Con tanto amor (Con tanto amor)\n[00:43.12]Me diste tú, se marchito\n[00:48.24]Me marcho hoy, yo sé perder\n[00:53.85]Pero, a-a-ay\n[00:56.30]Cómo me duele\n[00:59.35]A-a-ay\n[01:01.57]Cómo me duele\n[01:20.33]Si vieras como duele perder tu amor\n[01:25.60]Con tu adiós te llevas mi corazón\n[01:31.01]No sé si pueda volver a amar\n[01:36.13]Porque te di todo el amor que pude dar\n[01:41.03]Como la flor (Como la flor)\n[01:43.69]Con tanto amor (Con tanto amor)\n[01:46.32]Me diste tú, se marchito\n[01:51.44]Me marcho hoy, yo sé perder\n[01:56.83]Pero, a-a-ay\n[01:59.50]Cómo me duele\n[02:02.77]A-a-ay\n[02:04.77]Cómo me duele\n[02:15.14]Como la flor (Como la flor)\n[02:17.97]Con tanto amor (Con tanto amor)\n[02:20.44]Me diste tú, se marchito\n[02:25.66]Me marcho hoy, yo sé perder\n[02:31.33]Pero, a-a-ay\n[02:33.70]Cómo me duele\n[02:36.93]A-a-ay\n[02:38.89]Cómo me duele\n[02:42.30]A-a-ay\n[02:44.29]Cómo me duele\n"},"tlyric":{"version":10,"lyric":"[by:一片丹心向日葵]\n[00:17.23]我知道你已另寻新欢\n[00:22.35]尽管如此 我仍愿你一切安好\n[00:27.55]如果在我身上你找不到幸福\n[00:32.88]或许别人能够给予你\n[00:37.90]如一朵花(如一朵花)\n[00:40.35]载着万般爱意(载着万般爱意)\n[00:43.12]你赠予给我的花 也已枯萎\n[00:48.24]我必须离开 我知道我输了\n[00:53.85]但是啊\n[00:56.30]我真的很心痛\n[00:59.35]啊\n[01:01.57]我真的很痛\n[01:20.33]你无法想象失去你的爱我有多伤痛\n[01:25.60]随着你的道别 
也带走了我的心\n[01:31.01]我不知道未来是否会再爱一次\n[01:36.13]因为我已把能给的爱都奉献给你\n[01:41.03]如一朵花(如一朵花)\n[01:43.69]载着万般爱意(载着万般爱意)\n[01:46.32]你赠予给我的花 也已枯萎\n[01:51.44]我必须离开 我知道我输了\n[01:56.83]但是啊\n[01:59.50]我真的很心痛\n[02:02.77]啊\n[02:04.77]我真的很痛\n[02:15.14]如一朵花(如一朵花)\n[02:17.97]载着万般爱意(载着万般爱意)\n[02:20.44]你赠予给我的花 也已枯萎\n[02:25.66]我必须离开 我知道我输了\n[02:31.33]但是啊\n[02:33.70]我真的很心痛\n[02:36.93]啊\n[02:38.89]我真的很痛\n[02:42.30]啊\n[02:44.29]我真的很痛"},"code":200}
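The response above keeps the original lyrics in lrc.lyric and the translation in tlyric.lyric, both as LRC text with [mm:ss.xx] time tags. A minimal sketch of how the two could be lined up by timestamp follows; the helper name parse_lrc is hypothetical, and sample is a shortened copy of the response shown above, not a new API result.

```python
import re

# [mm:ss.xx]text ; tags such as [by:...] have no numeric time and are skipped.
LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")


def parse_lrc(lyric):
    """Return a list of (seconds, text) pairs from LRC-formatted text."""
    entries = []
    for line in lyric.splitlines():
        match = LRC_LINE.match(line)
        if match:
            minutes, seconds, text = match.groups()
            entries.append((int(minutes) * 60 + float(seconds), text.strip()))
    return entries


# Shortened copy of the response above.
sample = {
    "lrc": {"lyric": "[00:00.00] 作词 : A.B. Quintanilla III/Pete Astudillo\n"
                     "[00:17.23]Yo sé que tienes un nuevo amor\n"
                     "[00:22.35]Sin embargo, te deseo lo mejor\n"},
    "tlyric": {"lyric": "[by:一片丹心向日葵]\n"
                        "[00:17.23]我知道你已另寻新欢\n"
                        "[00:22.35]尽管如此 我仍愿你一切安好\n"},
}

original = parse_lrc(sample["lrc"]["lyric"])
translated = dict(parse_lrc(sample["tlyric"]["lyric"]))

# Keep only the lines that exist in both versions.
pairs = [(t, text, translated[t]) for t, text in original if t in translated]
for t, text, translation in pairs:
    print(f"{t:7.2f}  {text}  |  {translation}")
```

Matching on the parsed timestamp works here because both fields use identical time tags; lines that appear in only one of the two (such as the credits line) are dropped.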
def split_dict_keys(original_dict):
    new_dict = {}
    filted_new_dict = {}
    for key, value in original_dict.items():
        # Splitting the key on the underscore
        front_part = key.split('_')[0]
        # Using the front part as the new key
        if key.split('_')[1] != "-1":  # a "not found" key may be caused by a network problem rather than a missing song
            new_dict[front_part] = value
            filted_new_dict[key] = value
    return new_dict, filted_new_dict
updated_data = {}
exist_keys = {}
if os.path.exists(output_path):
    with open(output_path, 'r', encoding='utf-8') as file2:
        json_data_exist = json.load(file2)
    exist_keys, updated_data = split_dict_keys(json_data_exist)  # load data crawled before
    # updated_data = json_data_exist  # load data crawled before into current dict
temp_num = 0
for key, value in tqdm(json_data.items()):
    temp_num += 1
    if key in exist_keys:  # check if the key is already in the crawled data
        continue
    else:
        artist, song = next(iter(value.items()))
        search_str = f"{artist}: {song}"
        song_id = None

        try:
            song_id = gotoSearchID(search_str)
        except Exception as e:
            song_id = -1  # to skip the data not found
            print("something went wrong:", e, "\n")
            if "I/O operation on closed file" in str(e):
                break

        if song_id is not None:
            new_key = f"{key}_{song_id}"
            updated_data[new_key] = {artist: song}
        if temp_num % 10 == 0:
            # the with block closes the file, so no explicit close() is needed
            with open(output_path, "w", encoding='utf-8') as file1:
                json.dump(updated_data, file1, indent=4, ensure_ascii=False)
            time.sleep(1)
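The loop above rewrites the whole output file every ten items. If the process is interrupted in the middle of a write, the file could be left truncated and json.load would fail on the next run. One possible way to guard against that, which is not what the original script does, is to write to a temporary file first and then swap it in with os.replace. The helper name save_checkpoint is hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(data, output_path):
    """Write data as JSON next to output_path, then replace output_path in one step."""
    directory = os.path.dirname(os.path.abspath(output_path))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            json.dump(data, tmp_file, indent=4, ensure_ascii=False)
        os.replace(tmp_path, output_path)
    except BaseException:
        # Clean up the partial temp file and let the error propagate.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the loop, the periodic save would then become save_checkpoint(updated_data, output_path), and the same call could be repeated once after the loop so the last few items are not lost.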
import os
import json
import requests
from tqdm import tqdm
def get_filenames_dict(folder_path):
    # Dictionary to store filenames without extension
    filenames_dict = {}

    # Iterate over all files in the folder
    for filename in os.listdir(folder_path):
        # Check if it's a file and not a directory
        if os.path.isfile(os.path.join(folder_path, filename)):
            # Split the filename from its extension and add to the dictionary
            name, _ = os.path.splitext(filename)
            filenames_dict[name] = None  # Or any other default value

    return filenames_dict
def get_key_and_id(ori_key):
    # Keys have the form "<name>_<song id>"
    key = ori_key.split('_')[0]
    song_id = ori_key.split('_')[1]
    return key, song_id
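Splitting on the first underscore works as long as the name part never contains an underscore itself. If it might, one option is to split on the last underscore instead. The sketch below assumes the id part never contains an underscore; the function name split_key_and_id and the sample keys are made up for illustration.

```python
def split_key_and_id(ori_key):
    """Split '<name>_<song id>' on the LAST underscore so names containing '_' stay intact."""
    name, sep, song_id = ori_key.rpartition("_")
    if not sep:
        raise ValueError(f"expected '<name>_<song id>', got {ori_key!r}")
    return name, song_id


print(split_key_and_id("como_la_flor_12345"))  # name keeps its underscores
print(split_key_and_id("track01_-1"))          # "-1" marks a song that was not found
```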
something went wrong: Message:
Stacktrace:
	GetHandleVerifier [0x00007FF6EE2D82B2+55298]
	(No symbol) [0x00007FF6EE245E02]
	(No symbol) [0x00007FF6EE1005AB]
	(No symbol) [0x00007FF6EE14175C]
	(No symbol) [0x00007FF6EE1418DC]
	(No symbol) [0x00007FF6EE17CBC7]
	(No symbol) [0x00007FF6EE1620EF]
	(No symbol) [0x00007FF6EE17AAA4]
	(No symbol) [0x00007FF6EE161E83]
	(No symbol) [0x00007FF6EE13670A]
	(No symbol) [0x00007FF6EE137964]
	GetHandleVerifier [0x00007FF6EE650AAB+3694587]
	GetHandleVerifier [0x00007FF6EE6A728E+4048862]
	GetHandleVerifier [0x00007FF6EE69F173+4015811]
	GetHandleVerifier [0x00007FF6EE3747D6+695590]
	(No symbol) [0x00007FF6EE250CE8]
	(No symbol) [0x00007FF6EE24CF34]
	(No symbol) [0x00007FF6EE24D062]
	(No symbol) [0x00007FF6EE23D3A3]
	BaseThreadInitThunk [0x00007FFCA6CF7344+20]
	RtlUserThreadStart [0x00007FFCA73626B1+33]
2:
1
something went wrong: can only concatenate str (not "TimeoutException") to str
try:
    song_id = gotoSearchID(search_str)
except Exception as e:
    song_id = -1  # to skip the data not found
    print("something went wrong:", e, "\n")
    if "I/O operation on closed file" in str(e):
        break
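The message can only concatenate str (not "TimeoutException") to str is the TypeError Python raises when an exception object is joined to a string with +. The print call in the handler above passes e as a separate argument, so it is not the source; a likely cause, which is a guess because that code is not shown here, is a "text" + e expression somewhere inside gotoSearchID. The sketch below reproduces the error with a local stand-in class (not the real Selenium exception) and shows a formatting that avoids it; describe_failure is a hypothetical helper.

```python
class TimeoutException(Exception):
    """Local stand-in for Selenium's TimeoutException, used only for illustration."""


def describe_failure(search_str, error):
    # An f-string converts the exception with str(); "text" + error would raise TypeError.
    return f"search failed for {search_str!r}: {type(error).__name__}: {error}"


reproduced = None
try:
    raise TimeoutException("page did not load")
except TimeoutException as e:
    try:
        message = "search failed: " + e   # same mistake as in the log above
    except TypeError as type_error:
        reproduced = str(type_error)
    safe_message = describe_failure("some artist: Como La Flor", e)

print("reproduced:", reproduced)
print(safe_message)
```

Including type(error).__name__ in the message also helps when the exception text is empty, as in the "Message:" line of the first log.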
os.environ["webdriver.chrome.silentOutput"] = "true"  # not sure whether this line has any effect
chrome_options.add_argument("--log-level=3")  # make it silent
chrome_options.add_argument("--headless")  # Run Chrome in headless mode
chrome_options.add_argument("--no-sandbox")  # Bypass OS security model
chrome_options.add_argument("--disable-dev-shm-usage")  # Overcome limited resource problems
chrome_options.add_argument('--disable-gpu')  # Disable GPU for headless mode
os.environ["webdriver.chrome.silentOutput"] = "true"
chrome_options.add_argument("--log-level=3")  # make it silent
chrome_service = ChromeService(executable_path=r"chromedriver.exe", log_path=os.devnull)  # meant to silence it completely, but it does not work