Downloading, splicing and converting Youku videos

Torrents seem to be getting less popular these days, so the only place I can reliably find the latest GnT episode (the day after it airs, and in reasonable quality) is Youku. The only question was how to save the video in a form I could stream to a PS3 or Xbox.
Just this week, JDownloader updated itself, and the answer came. It can now be used to download the fairly high quality Youku videos posted very soon after GnT airs (as well as numerous other high quality programs, like Lincoln, DowntownDX, and other favorites like ItteQ and OgonDansetsu.)

But downloading from Youku via JDownloader splits the video (a .flv or .mp4) into numerous _part-0X files. The question was how to stitch them together again.
In order to do this, I’ve been using avidemux (actually avidemux2_cli on Ubuntu Linux) and a handy Perl script. The script takes the name of one video file from the sequence, locates all the other corresponding files in the same directory, feeds the whole lot into avidemux, and spits out an mp4 file.
I’ll include the script below-- it could use some work (like improving the regular expressions) but is working a treat for me right now. I hope someone else might get some use out of it as well.

[code:nxk8dvry]
#!/usr/bin/perl -w
#Friday, September 9th 2011
#-------------------------------------------------------------

#joinvideoseq script by on_three

#-------------------------------------------------------------
#Stitch together part_0X video files in the same directory
#and save the joined files as a single mp4.
#this script requires avidemux2_cli, and has only been
#tested on linux.

#settings
$AUDIO_CODEC="copy";
$VIDEO_CODEC="copy";
$OUTPUT_FORMAT="mp4";

#(1) quit unless we have the correct number of command-line args

$num_args = $#ARGV + 1;
if ($num_args != 1) {
print "\nUSAGE: <file in sequence>\n";
exit;
}

#Catch the input file-- first and only argument

$INPUT_FILE=$ARGV[0];

#Test the input filename to see if it matches the pattern we expect
if ($INPUT_FILE =~ m/^.*_part.*\..*$/) {

} else {
print "Input filename does not match expected pattern.";
exit;
}

#Form the base filename from the input file
$BASEFILENAME = $INPUT_FILE;
$BASEFILENAME =~ s/^(.*)_part.*\..*$/$1/;

#print $BASEFILENAME

$OUTPUT_FILENAME="$BASEFILENAME.$OUTPUT_FORMAT";
#print $NEW_FILENAME

#Collect all files in directory that match file base pattern.
opendir(DIR, ".");
@files = grep(/^$BASEFILENAME.*$/,readdir(DIR));
closedir(DIR);

#make sure the found filenames are in numerical order
@files = sort(@files);

#print all the filenames in our array
#foreach $file (@files) {
#	print "$file\n";
#}

#form our command array to execute
@command = ();
push(@command,"avidemux2_cli");
push(@command,"--audio-codec");
push(@command,$AUDIO_CODEC);
push(@command,"--video-codec");
push(@command,$VIDEO_CODEC);
push(@command,"--output-format");
push(@command,$OUTPUT_FORMAT);
#load the first file in the sequence, then append each of the rest
push(@command,"--load");
push(@command,$files[0]);
for ($count = 1; $count < @files; $count++) {
	push(@command,"--append");
	push(@command,$files[$count]);
}
push(@command,"--save");
push(@command,$OUTPUT_FILENAME);

#foreach $word (@command)
#{
#	print "$word\n";
#}
#FINALLY execute our command
system(@command);
[/code:nxk8dvry]
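
One caveat about the ordering step: Perl's plain sort() compares the names as strings, so it only gives the right order while the part numbers are zero-padded (part-00, part-01, ...). If a download ever produced unpadded numbers, part-10 would land before part-2. The following is a minimal sketch of sorting on the number itself. The helper names part_number and sort_parts are made up for illustration, and it assumes the names end in _part-<digits>.<extension>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the numeric part index from a name like "show_part-07.flv".
# Returns -1 when the name does not carry a part number.
# (Hypothetical helper, not part of the script above.)
sub part_number {
    my ($name) = @_;
    return $name =~ /_part-(\d+)\.[^.]+$/ ? $1 + 0 : -1;
}

# Sort by part number first, falling back to the name itself for ties.
sub sort_parts {
    my @names = @_;
    return sort { part_number($a) <=> part_number($b) or $a cmp $b } @names;
}

# Small demonstration with unpadded part numbers.
my @parts = sort_parts("show_part-10.flv", "show_part-2.flv", "show_part-1.flv");
print "$_\n" for @parts;
```

In the script, this would stand in for the line "@files = sort(@files);".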

– 11.09.2011, 16:01 –

I’ve been using the above script for about three days, but ran into some trouble streaming the resultant files to PS3. Although I was able to salvage the section that actually forms the sequential file list to process, I had to change the programs that actually converted the file containers from .flv to .mp4.

I’m very much pleased with the improvement, and I’ve included the updated file below. It now relies upon the command-line tools "ffmpeg" and "MP4Box." Again, I’ve only done this on Ubuntu Linux. There, I had to update my copies of ffmpeg and MP4Box via the Medibuntu repository, because the versions in the normal Ubuntu repository don’t include the H.264 or AAC codecs, both of which are used in Youku FLV and MP4 container files.
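
Since the script now depends on two external programs, it can help to confirm they are installed before starting a long run. This is a minimal sketch that only looks for the executables on the PATH. It does not verify that the installed ffmpeg build actually has H.264 or AAC support. The helper name find_program is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return the full path of $program if an executable file with that name
# exists in one of the PATH directories, otherwise return undef.
# (Hypothetical helper, not part of the script in this post.)
sub find_program {
    my ($program) = @_;
    for my $dir (File::Spec->path()) {
        my $candidate = File::Spec->catfile($dir, $program);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# Report on the two tools the script relies on.
for my $tool ("ffmpeg", "MP4Box") {
    my $path = find_program($tool);
    print defined($path) ? "$tool found at $path\n" : "$tool not found on PATH\n";
}
```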

[code:nxk8dvry]
#!/usr/bin/perl -w
#Friday, September 9th 2011
#-------------------------------------------------------------

#joinvideoseq script by on_three

#-------------------------------------------------------------
#Stitch together part_0X video files in the same directory
#and save the joined files as a single mp4.
#this script requires avidemux2_cli, and has only been
#tested on linux.

#Saturday, Sept 10th 2011
#-------------------------------------------------------------
#+Updated search for similar files in sequence (line 59)
#+Added separate handling of mp4 input via mencoder application
#+Added --nogui option to avidemux2_cli calls for better, automated running.
#+Added checking for no file sequence found.
#-Still some problems with audio synch when using mencoder to join mp4s.

#Sunday, Sept 11th 2011 (;_;)
#-------------------------------------------------------------
#* Script version 2.0. Complete revision
#Script now relies upon two command line applications
#1) ffmpeg (with h.264 and AAC support. Ubuntu users might have to
#   update their versions via the Medibuntu repository)
#2) MP4Box (again, Ubuntu users might have to get the Medibuntu version)
#These changes were brought in because stuttering and sound
#sync problems were seen in some output files, and PS3 playback
#of the previous output was quite awful. These ought not be
#a problem anymore.

#settings
$AUDIO_CODEC="copy";
$VIDEO_CODEC="copy";
$OUTPUT_FORMAT="mp4";

#(1) quit unless we have the correct number of command-line args

$num_args = $#ARGV + 1;
if ($num_args != 1) {
print "\nUSAGE: <file in sequence>\n";
exit;
}

#Catch the input file-- first and only argument

$INPUT_FILE=$ARGV[0];

#Test the input filename to see if it matches the pattern we expect
if ($INPUT_FILE =~ m/^.*_part.*\..*$/) {

} else {
print "Input filename does not match expected pattern.";
exit;
}

#Form the base filename from the input file
$BASEFILENAME = $INPUT_FILE;
$BASEFILENAME =~ s/^(.*)_part.*\..*$/$1/;
print "The basic pattern for the sequence we’ll look for is: $BASEFILENAME \n";

#We also want to strip off the original file extension, so that .flv and .mp4 input can be handled differently.
$INPUT_FORMAT = $INPUT_FILE;
$INPUT_FORMAT =~ s/^.*_part.*\.(.*)$/$1/;

print "The input format is $INPUT_FORMAT\n";

#print $BASEFILENAME

$OUTPUT_FILENAME="$BASEFILENAME.$OUTPUT_FORMAT";
#print $NEW_FILENAME

#Collect all files in directory that match file base pattern.
opendir(DIR, ".");
#as we’ll use the base filename in our regex, we need to "quotemeta"
#in order to keep out any problematic characters in the name that
#the regex might reinterpret (parentheses in one name prompted me to add this)
$REGEX_BASEFILENAME = quotemeta($BASEFILENAME);
@files = grep(/^$REGEX_BASEFILENAME.*$/,readdir(DIR));
closedir(DIR);

#make sure the found filenames are in numerical order
@files = sort(@files);

#Make sure we’ve found a series to process
if( @files < 2 )
{
print "Sorry, we haven't found a sequence of videos to process\n";
exit;
}

print "Found the following files to join together (in this order):\n";
foreach $file (@files) {
print "$file\n";
}

#Execute differently, depending on whether the input file container was .mp4 or .flv

if( lc($INPUT_FORMAT) eq "flv" )
{
my @tempfiles=();
#ffmpeg -i ‘クイズ!ヘキサゴンII - 11.09.07_part-06.flv’ -vcodec copy -acodec copy ‘クイズ!ヘキサゴンII - 11.09.07_part-06.mp4’
for ($count = 0; $count < @files; $count++) {
my @command = ();
my $output_filename = "$files[$count].mp4";
push(@command,"ffmpeg");
push(@command,"-i");
push(@command,$files[$count]);
push(@command,"-acodec");
push(@command,$AUDIO_CODEC);
push(@command,"-vcodec");
push(@command,$VIDEO_CODEC);
push(@command,$output_filename);

	push(@tempfiles,$output_filename);

	system(@command);#convert this .flv to a temp .mp4
}

#now join up all the temporary .mp4 files, and delete the temporary ones
JoinMP4Files(@tempfiles);
#remove our temporary files
unshift(@tempfiles,"rm");#unshift pushes to the FRONT of the array.
system(@tempfiles);

}elsif( lc($INPUT_FORMAT) eq "mp4"){
JoinMP4Files(@files);
}

sub JoinMP4Files
{
#sub expects an array of input files to join
my @files = @_;

#make system call to MP4Box to join files of form:
#MP4Box -add 'ガキの使い(2009.07.26)_part-00.mp4' -cat 'ガキの使い(2009.07.26)_part-01.mp4' -cat 'ガキの使い(2009.07.26)_part-02.mp4' -cat 'ガキの使い(2009.07.26)_part-03.mp4' -new 'ガキの使い(2009.07.26).mp4'
my @command = ();
push(@command,"MP4Box");
push(@command,"-add");
push(@command,$files[0]);
for ($count = 1; $count < @files; $count++) {
	push(@command,"-cat");
	push(@command,$files[$count]);
}
push(@command,"-new");
push(@command,$OUTPUT_FILENAME);

system(@command);

}

[/code:nxk8dvry]
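
One thing to watch with the directory search in the script: it keeps every name that starts with the base filename, so on a second run it would also pick up an earlier joined output file or a leftover temporary .flv.mp4. Here is a minimal sketch of a stricter filter that keeps only the numbered parts with the expected extension. The helper name select_parts is made up for illustration, and it assumes the parts are named <base>_part-<digits>.<extension>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only names of the exact form "<base>_part-<digits>.<ext>".
# \Q...\E quotes the base name and extension the same way quotemeta() does,
# so parentheses and dots in the title are matched literally.
# (Hypothetical helper, not part of the script above.)
sub select_parts {
    my ($base, $ext, @names) = @_;
    my $pattern = qr/^\Q$base\E_part-\d+\.\Q$ext\E$/;
    return grep { $_ =~ $pattern } @names;
}

# Demonstration: a directory listing that contains leftovers from an earlier run.
my @listing = (
    "show(2011.09.07)_part-00.flv",
    "show(2011.09.07)_part-01.flv",
    "show(2011.09.07)_part-00.flv.mp4",   # leftover temporary file
    "show(2011.09.07).mp4",               # earlier joined output
);
print "$_\n" for select_parts("show(2011.09.07)", "flv", @listing);
```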

Here is a simple batch script for all Windows users. You’ll need ffmpeg, and you’ll need to edit the filenames in the batch file to match your setup and the number of parts the video has. I’m a batch noob, so no fancy dynamic filename detection for you (if that’s even possible in batch).

[code:zwcj19pj]ffmpeg -y -i "1.flv" -vcodec copy -vbsf h264_mp4toannexb -acodec copy part1.ts
ffmpeg -y -i "2.flv" -vcodec copy -vbsf h264_mp4toannexb -acodec copy part2.ts
ffmpeg -y -i "3.flv" -vcodec copy -vbsf h264_mp4toannexb -acodec copy part3.ts
ffmpeg -y -i "4.flv" -vcodec copy -vbsf h264_mp4toannexb -acodec copy part4.ts
ffmpeg -y -i concat:"part1.ts|part2.ts|part3.ts|part4.ts" -vcodec copy -acodec copy -absf aac_adtstoasc output.mp4
DEL part1.ts part2.ts part3.ts part4.ts[/code:zwcj19pj]

Hand,
It might be possible to put the ffmpeg calls you’re making into the Perl script I’ve written above to make a Windows-specific version. You would just need to install a Perl interpreter on Windows (I use Strawberry Perl on Windows at work) and then add your lines to the section where the system command is built (about line 100+).
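
As a rough idea of what that could look like, here is a minimal sketch that builds the same ffmpeg calls as the batch file for any number of parts. The flags are copied from the batch file as posted (newer ffmpeg builds spell -vbsf and -absf as -bsf:v and -bsf:a). The function name build_concat_commands is made up for illustration, and the sketch only prints the commands rather than running them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the ffmpeg commands for joining the given input files:
# one command per part to remux it into an MPEG-TS file, then a final
# command that concatenates the .ts files into the output .mp4.
# Returns references to the command list and the temporary .ts file list.
# (Hypothetical helper; flags are taken from the batch file above.)
sub build_concat_commands {
    my ($output, @inputs) = @_;
    my (@commands, @ts_files);
    my $n = 0;
    for my $input (@inputs) {
        my $ts = sprintf("part%d.ts", ++$n);
        push @ts_files, $ts;
        push @commands, ["ffmpeg", "-y", "-i", $input,
                         "-vcodec", "copy", "-vbsf", "h264_mp4toannexb",
                         "-acodec", "copy", $ts];
    }
    push @commands, ["ffmpeg", "-y", "-i", "concat:" . join("|", @ts_files),
                     "-vcodec", "copy", "-acodec", "copy",
                     "-absf", "aac_adtstoasc", $output];
    return (\@commands, \@ts_files);
}

# Dry run: print the commands instead of executing them.
my ($commands, $ts_files) = build_concat_commands("output.mp4", "1.flv", "2.flv", "3.flv");
print join(" ", @$_), "\n" for @$commands;
```

To run the commands for real, each array could be passed to system(@$_) in turn, followed by unlink(@$ts_files) to remove the temporary .ts files. Passing a list to system() also avoids shell quoting problems with spaces or Japanese characters in the filenames.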

I don’t know if I’m going to be able to find the time to do so, but I’m thinking of writing a plugin for JDownloader that could automatically run a given script after download completion. It could then attempt to run a "stitching" script automatically and then copy the resultant file to a target directory.

It would have to be in Java (I’m guessing), which I’m not very good at-- so I doubt I’d be able to find the time. Still, if there are any Java mavens about, perhaps we could all contribute?

Maybe someone should contact the JDownloader team about this. They are really eager to add user requested features.

I did some research into JDownloader, and it seems their roadmap includes the development of a new "event" manager-- an extension which could run a custom script (using variables defined off the download name, etc.) at download completion. I can’t judge progress, but it appears to be in the pipeline.
I also thought I might try to code something in the meantime, but I couldn’t find any resources on coding plugins for JDownloader. I don’t want to make a meal of it, so I guess I’ll just wait for the Event Manager.