Karaoke Library Cleanup and Merging: Difference between revisions

Latest revision as of 04:31, 19 December 2021

Process

File Content Issues

  • Are there folders?
  • Do the contents match the zip name?
  • Has uppercase MP3/CDG?
  • More than 2 files in zip?
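
The checklist above can be partly automated. The following is a minimal sketch, not one of the scripts listed on this page: check_zip_listing is a hypothetical helper that reads the member names of one archive on stdin (one per line) and prints a line for each problem it finds. It assumes a good zip holds exactly one .mp3 and one .cdg, both named after the zip.

```shell
#!/usr/bin/env bash
# check_zip_listing: hypothetical helper (an assumption, not a script from this page).
# stdin: member names of one zip, one per line.
# $1:    the zip's base name without the .zip extension.
check_zip_listing() {
    local base="$1" count=0 member stem ext
    while IFS= read -r member; do
        [ -n "$member" ] || continue
        case "$member" in
            */)  echo "folder entry: $member"; continue ;;   # a directory stored in the zip
            */*) echo "inside folder: $member" ;;            # a file stored under a directory
        esac
        count=$((count + 1))
        stem="${member##*/}"     # drop any folder prefix
        ext="${stem##*.}"        # text after the last dot
        stem="${stem%.*}"        # text before the last dot
        case "$ext" in
            mp3|cdg) ;;
            [Mm][Pp]3|[Cc][Dd][Gg]) echo "uppercase extension: $member" ;;
            *) echo "unexpected file type: $member" ;;
        esac
        [ "$stem" = "$base" ] || echo "name mismatch: $member"
    done
    [ "$count" -le 2 ] || echo "too many files: $count"
}
```

One way to feed it, assuming Info-ZIP's zipinfo is installed, is: zipinfo -1 "$zip" | check_zip_listing "$(basename "$zip" .zip)". No output means the archive passed every check.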

File Name Issues

  • Rename folders whose names end in .zip, .mp3, or .cdg to remove the extension
# -depth lists nested folders before their parents, so each path is still valid when it is renamed
find . -depth -type d \( -iname "*.zip" -o -iname "*.mp3" -o -iname "*.cdg" \) | while IFS= read -r fn ; do
	mv "$fn" "${fn%.*}"			# single % strips only the final . and extension
	done
  • Combine bare matching .mp3 and .cdg files into a zip. We search for *.mp3 because a "bare" .cdg (one with no matching .mp3) is useless. The script fails silently if the .cdg is missing.
find <srcdir> -name "*.mp3" | zip_mp3cdg
  • Make sure all zip files end in lowercase .zip
find . -iname "*.zip" ! -name "*.zip" | while IFS= read -r fn ; do
	rename -f 's/\.zip$/.zip/i' "$fn"
	done
  • Find files with an underscore (_), multiple adjacent spaces, or a space at the end of the name
find . -iregex '.*/[^/]*\(_\|  +\)[^/]*\.zip' -o -name "* .zip" \
	| rn_zip_stripchars
  • Clean up files with more than three " - " fields
find . -name "* - * - * - *.zip"
  • Find files with fewer than three fields
find . -name "*.zip" ! -name "* - * - *.zip"
  • Find files that don't match "DiskID - field 1 - field 2.zip"
find . -name "*.zip" ! -iregex '.*/.*-[0-9][0-9]+ - .* - .*\.zip'
  • Move files matching "DiskID - field 1 - field 2.zip" to Work01 (the command below only lists them)
find . -iregex '.*/.*-[0-9][0-9]+ - .* - .*\.zip'
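
The last command only prints the matching files. A minimal sketch of the move itself follows, assuming GNU find (for -iregex with its default regex syntax) and an mv that supports -n. move_conforming is a hypothetical helper, not one of the scripts listed on this page.

```shell
#!/usr/bin/env bash
# move_conforming: hypothetical helper (an assumption, not a script from this page).
# $1: directory to search
# $2: destination directory, written the way find prints it
#     (for example ./Work01 when $1 is .)
move_conforming() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    # -path "$dest" -prune keeps find from descending into the destination.
    # mv -n refuses to overwrite a file that already exists in the destination.
    find "$src" -path "$dest" -prune -o -type f \
        -iregex '.*/.*-[0-9][0-9]+ - .* - .*\.zip' \
        -exec mv -n -- {} "$dest"/ \;
}
```

Because the destination is flat, two files with the same name in different folders collide. With -n the second one is left where it was, which makes it easy to spot afterwards.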


  • Standard naming should be: <trackid> - <artist> - <title>.zip
    • Use ",The" naming where appropriate
    • Use Last, First for <artist>
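
The ", The" rule is mechanical enough to sketch in shell. The helpers below are hypothetical, not scripts from this page. They assume the rule applies to the <artist> field only and that the name already has exactly three " - " separated fields. The "Last, First" rule is not attempted, since it needs human judgment.

```shell
#!/usr/bin/env bash
# the_to_suffix: hypothetical helper. Moves a leading "The " to the end as ", The".
the_to_suffix() {
    case "$1" in
        [Tt]he\ *) printf '%s, The\n' "${1#??? }" ;;   # strip the first three chars plus the space
        *)         printf '%s\n' "$1" ;;
    esac
}

# normalize_the: hypothetical helper. Rewrites "<trackid> - <artist> - <title>.zip"
# so the artist uses ", The" naming. Names without exactly three fields are printed
# unchanged and the function returns 1.
normalize_the() {
    local name="$1" trackid rest artist title
    case "$name" in
        *" - "*" - "*" - "*) printf '%s\n' "$name"; return 1 ;;   # more than three fields
        *" - "*" - "*.zip)   ;;                                   # exactly three fields
        *)                   printf '%s\n' "$name"; return 1 ;;   # fewer than three fields
    esac
    trackid="${name%% - *}"      # everything before the first " - "
    rest="${name#* - }"          # everything after the first " - "
    artist="${rest%% - *}"
    title="${rest#* - }"
    title="${title%.zip}"
    printf '%s - %s - %s.zip\n' "$trackid" "$(the_to_suffix "$artist")" "$title"
}
```

For example, normalize_the "SC-01 - The Beatles - Help.zip" prints "SC-01 - Beatles, The - Help.zip". The track ID here is made up for illustration.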

Find Duplicates

  • Look for duplicate <trackid> values
    Assume an exact duplicate in the library is good unless clearly marked otherwise.
  • Search for duplicates by:
    • Artist+Title
    • Soundtrack+Title (e.g. Grease - Greased Lightning)
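
Both searches can be sketched with standard tools once the names follow the three-field convention. The helpers below are hypothetical, not scripts from this page. They compare case-insensitively, assume no file name contains a newline, and treat a soundtrack name as occupying the <artist> field, so one check covers both Artist+Title and Soundtrack+Title.

```shell
#!/usr/bin/env bash
# list_names: hypothetical helper. Prints the base name, without .zip, of every
# zip under $1 that has at least three " - " separated fields.
list_names() {
    find "$1" -type f -name '* - * - *.zip' | sed -e 's|.*/||' -e 's/\.zip$//'
}

# dup_trackids: prints each lowercased track ID that appears on more than one zip.
dup_trackids() {
    list_names "$1" | awk -F ' - ' '{ print tolower($1) }' | sort | uniq -d
}

# dup_artist_title: prints each lowercased "artist - title" pair that appears on
# more than one zip. Names with more than three fields are skipped.
dup_artist_title() {
    list_names "$1" | awk -F ' - ' 'NF == 3 { print tolower($2 " - " $3) }' | sort | uniq -d
}
```

Each function only reports the duplicated key. Something like find . -iname "<key>*" (or the zip_find script, if that is what it does) would still be needed to locate the files themselves.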

Scripts

  • mv_file_by_pub
  • mv_tree
  • rn_by_artist
  • rn_by_artist_interactive
  • rn_by_count
  • rn_zip_by_content
  • rn_zip_contents
  • rn_zip_interactive
  • rn_zip_stripchars
  • rn_zip_swap_fields_interactive
  • zip_fc_fix
  • zip_find
  • zip_flatten
  • zip_matchsub
  • zip_mp3cdg
  • zip_test