Guru's notes on Unix, SCM, PHP and Perl: December 2010

Friday, December 31, 2010

How to extract metadata from audio files on the command line using Perl

$ cat test.pl  
use strict;
use warnings;
use MP3::Tag;
# set filename of MP3 track
my $filename = "test.mp3";
# create new MP3-Tag object
my $mp3 = MP3::Tag->new($filename) or die "Cannot open $filename\n";
# get tag information
my ($title, $track, $artist, $album, $comment, $year, $genre) = $mp3->autoinfo();
print "$title, $track, $artist, $album, $comment, $year, $genre\n";

$ perl test.pl 

Nakka Mukka, 6/9, Chinnaponnu & Nakulan, Kadhalil Vizhundhen, Digitised by Guru Ashok, 2008, Tamil

How to install a module from CPAN?

The easiest way is to have a module also named CPAN do it for you. This module comes with perl version 5.004 and later.

$ perl -MCPAN -e shell
  cpan shell -- CPAN exploration and modules installation(v1.59_54)        
  ReadLine support enabled      
cpan> install Some::Module

To manually install the CPAN module, or any well-behaved CPAN module for that matter, follow these steps:

Unpack the source into a temporary area.

perl Makefile.PL     
make     
make test     
make install

Wednesday, December 29, 2010

PHP and XML - Create, Add, Edit, Modify using DOM and SimpleXML

Over the last few working days, I spent quite a bit of time playing around with XML. While searching the net, I found few comprehensive PHP XML guides, and no one-stop guide covering all the common operations.

As such I decided to club together examples of all kinds of operations I ever did on XML in a single post. I hope it benefits others out there who wish to learn more about XML manipulation.

Note: Since the post got quite large, I decided to use only the tree-based parsers, DOM and SimpleXML.

Operations Performed:

(1) Create XML OR Array to XML Conversion OR CDATA Element Eg

(2) Edit XML – Edit/Modify Element Data (accessed serially)

(3) Edit XML – Edit specific Elements (accessed conditionally)

(4) Edit XML – Element Addition (to queue end)

(5) Edit XML – Element Addition (to queue start)

(6) Edit XML – Element Addition (before a specific node)

(7) Delete Elements (accessed serially)

(8) Delete Elements (accessed conditionally)

(9) Rearrange / Reorder Elements

(10) Display Required data in XML Form itself OR Remove all children nodes save one OR Copy/Clone Node Eg OR Compare/Search non numeric data (like date or time) to get result.

library.xml will be used in all operations.

P.S.: I have added indentation and whitespace between the tags in the XML below to make it readable.

Remove them before saving your XML file. Otherwise the whitespace is loaded as extra text nodes, and the DOM examples that index childNodes by position won't work as intended.

<?xml version="1.0"?>
<library>
    <book isbn="1001" pubdate="1943-01-01">
        <title><![CDATA[The Fountainhead]]></title>
        <author>Ayn Rand</author>
        <price>300</price>
    </book>
    <book isbn="1002" pubdate="1954-01-01">
        <title><![CDATA[The Lord of the Rings]]></title>
        <author>J.R.R.Tolkein</author>
        <price>500</price>
    </book>
    <book isbn="1003" pubdate="1982-01-01">
        <title><![CDATA[The Dark Tower]]></title>
        <author>Stephen King</author>
        <price>200</price>
    </book>
</library>

#######################################

// (1) Create XML OR Array to XML Conversion OR CDATA Element Eg

#######################################

// (i) SimpleXML :

// addChild() can't create a CDATA section, so the title is added as plain text in SimpleXML.
function fnSimpleXMLCreate()
    {
        $arr = array(array('isbn'=>'1001', 'pubdate'=>'1943-01-01', 'title'=>'The Fountainhead',
                               'author'=>'Ayn Rand', 'price'=>'300'),
                         array('isbn'=>'1002', 'pubdate'=>'1954-01-01',
                               'title'=>'The Lord of the Rings', 'author'=>'J.R.R.Tolkein',
                               'price'=>'500'),
                         array('isbn'=>'1003', 'pubdate'=>'1982-01-01', 'title'=>'The Dark Tower',
                               'author'=>'Stephen King', 'price'=>'200'));

        $library = new SimpleXMLElement('<library />');

        foreach($arr as $row)
        {
            $book = $library->addChild('book');
            $book->addAttribute('isbn', $row['isbn']);
            $book->addAttribute('pubdate', $row['pubdate']);
            $book->addChild('title', $row['title']); // addChild() can't create a CDATA section.
            $book->addChild('author', $row['author']);
            $book->addChild('price', $row['price']);
        }

        $library->asXML('library.xml');
    }

// (ii) DOM :

function fnDomCreate()
    {
       $arr = array(array('isbn'=>'1001', 'pubdate'=>'1943-01-01', 'title'=>'The Fountainhead',
                               'author'=>'Ayn Rand', 'price'=>'300'),
                         array('isbn'=>'1002', 'pubdate'=>'1954-01-01',
                               'title'=>'The Lord of the Rings', 'author'=>'J.R.R.Tolkein',
                               'price'=>'500'),
                         array('isbn'=>'1003', 'pubdate'=>'1982-01-01', 'title'=>'The Dark Tower',
                               'author'=>'Stephen King', 'price'=>'200'));

        $dom = new DOMDocument();
        $library = $dom->createElement('library');
        $dom->appendChild($library);

        foreach($arr as $row)
        {
            $book = $dom->createElement('book');
            $book->setAttribute('isbn', $row['isbn']);
            $book->setAttribute('pubdate', $row['pubdate']);

            // Plain-text alternative: $prop = $dom->createElement('title', $row['title']);
            $prop = $dom->createElement('title');
            $text = $dom->createCDATASection($row['title']);
            $prop->appendChild($text);
            $book->appendChild($prop);

            $prop = $dom->createElement('author', $row['author']);
            $book->appendChild($prop);
            $prop = $dom->createElement('price', $row['price']);
            $book->appendChild($prop);
            $library->appendChild($book);
        }
        //header("Content-type: text/xml");
        $dom->save('library.xml');
    }

#######################################

// (2) Edit XML – Edit/Modify Element Data (accessed serially)

#######################################

// (i) SimpleXML :

// Edit Last Book Title
function fnSimpleXMLEditElementSeq()
    {
        $library = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $num = count($library);
        $library->book[$num-1]->title .= ' - The Gunslinger';
        header("Content-type: text/xml");
        echo $library->asXML();
    }

// (ii) DOM :

//Edit Last Book Title
    function fnDOMEditElementSeq()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;
        $cnt = $library->childNodes->length;

        $library->childNodes->item($cnt-1)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';
       // 2nd way #$library->getElementsByTagName('book')->item($cnt-1)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';

       //3rd Way
       //$library->childNodes->item($cnt-1)->childNodes->item(0)->nodeValue .= ' Series';
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (3) Edit XML – Edit specific Elements (accessed conditionally)

#######################################

// (i) SimpleXML :

//Edit Title of book with author J.R.R.Tolkein
    function fnSimpleXMLEditElementCond()
    {
        $library = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $book = $library->xpath('/library/book[author="J.R.R.Tolkein"]');
        $book[0]->title .= ' Series';
        header("Content-type: text/xml");
        echo $library->asXML();
    }

// (ii) DOM (with XPath):

 //Edit Title of book with author J.R.R.Tolkein
    function fnDOMEditElementCond()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;
        $xpath = new DOMXPath($dom);
        $result = $xpath->query('/library/book[author="J.R.R.Tolkein"]/title');
        $result->item(0)->nodeValue .= ' Series';
        // This will remove the CDATA property of the element.
        //To retain it, delete this element (see delete eg) & recreate it with CDATA (see create xml eg).

        //2nd Way
        //$result = $xpath->query('/library/book[author="J.R.R.Tolkein"]');
       // $result->item(0)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';
        header("Content-type: text/xml");
        echo $dom->saveXML();

    }

#######################################

// (4) Edit XML – Element Addition (to queue end)

#######################################

// (i) SimpleXML :

//Add another Book to the end
    function fnSimpleXMLAddElement2End()
    {
        $library = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $book = $library->addChild('book');
        $book->addAttribute('isbn', '1004');
        $book->addAttribute('pubdate', '1960-07-11');
        $book->addChild('title', "To Kill a Mockingbird");
        $book->addChild('author', "Harper Lee");
        $book->addChild('price', "100");
        header("Content-type: text/xml");
        echo $library->asXML();
    }

// (ii) DOM :

    //Add another Book to the end
    function fnDOMAddElement2End()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;

        $book = $dom->createElement('book');
        $book->setAttribute('isbn','1004');
        $book->setAttribute('pubdate','1960-07-11');

        $prop = $dom->createElement('title');
        $text = $dom->createTextNode('To Kill a Mockingbird');
        $prop->appendChild($text);
        $book->appendChild($prop);

         $prop = $dom->createElement('author','Harper Lee');
        $book->appendChild($prop);
        $prop = $dom->createElement('price','100');
        $book->appendChild($prop);

        $library->appendChild($book);
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (5) Edit XML – Element Addition (to queue start)

#######################################

// (i) SimpleXML :

// Add a Book to List Start
// Insert Before Functionality not present in SimpleXML
// We can integrate DOM with SimpleXML to do it.
    function fnSimpleXMLAddElement2Start()
    {
        $libSimple = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $libDom = dom_import_simplexml($libSimple);

        $dom = new DOMDocument();
        //returns a copy of the node to import
        $libDom = $dom->importNode($libDom, true);
        //associate it with the current document.
        $dom->appendChild($libDom);

        fnDOMAddElement2Start($dom); //see below DOM function
    }

// (ii) DOM :

function fnDOMAddElement2Start($dom='')
    {
        if(!$dom)
        {
            $dom = new DOMDocument();
            $dom->load('library.xml');
        }
        $library = $dom->documentElement;
        #var_dump($library->childNodes->item(0)->parentNode->nodeName);
        $book = $dom->createElement('book');
        $book->setAttribute('isbn','1000');
        $book->setAttribute('pubdate','1960-07-11');

        $prop = $dom->createElement('title','To Kill a Mockingbird');
        $book->appendChild($prop);
         $prop = $dom->createElement('author','Harper Lee');
        $book->appendChild($prop);
         $prop = $dom->createElement('price','100');
        $book->appendChild($prop);

        // Insert the new book in front of the current first child of <library>.
        $library->insertBefore($book, $library->firstChild);
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (6) Edit XML – Element Addition (before a specific node)

#######################################

// (i) SimpleXML :

// Add a Book Before attribute isbn 1002
    // Insert Before Functionality not present in SimpleXML
    // We can integrate DOM with SimpleXML to do it.
    function fnSimpleXMLAddElementCond()
    {
        $libSimple = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $libDom = dom_import_simplexml($libSimple);

        $dom = new DOMDocument();
        //returns a copy of the node to import
        $libDom = $dom->importNode($libDom, true);
        //associate it with the current document.
        $dom->appendChild($libDom);

        fnDOMAddElementCond($dom); //see below DOM eg.
    }

// (ii) DOM :

// Add a Book Before isbn 1002
    function fnDOMAddElementCond($dom='')
    {
        if(!$dom)
        {
            $dom = new DOMDocument();
            $dom->load('library.xml');
        }
        $library = $dom->documentElement;

        $book = $dom->createElement('book');
        $book->setAttribute('isbn','1000');
        $book->setAttribute('pubdate', '1960-07-11');

        $prop = $dom->createElement('title','To Kill a Mockingbird');
        $book->appendChild($prop);
         $prop = $dom->createElement('author','Harper Lee');
        $book->appendChild($prop);
        $prop = $dom->createElement('price','100');
        $book->appendChild($prop);

        $xpath = new DOMXPath($dom);
        $result = $xpath->query('/library/book[@isbn="1002"]');
        // Insert the new book in front of the node matched by the XPath query.
        $result->item(0)->parentNode->insertBefore($book, $result->item(0));
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (7) Delete Elements (accessed serially)

#######################################

// (i) SimpleXML :

// Delete 2nd book
    function fnSimpleXMLDeleteSeq()
    {
        $library = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        //$library->book[1] = null; // this only empties content
        unset($library->book[1]);
        header("Content-type: text/xml");
        echo $library->asXML();

    }

// (ii) DOM :

// Delete 2nd Book
    function fnDOMDeleteSeq()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;

        // item(1) is the 2nd book only if the file has no whitespace between tags.
        $library->removeChild($library->childNodes->item(1));

        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (8) Delete Elements (accessed conditionally)

#######################################

// (i) SimpleXML :

// Delete a book with  200<price<500
    // unset($book[0]) only drops the entry from the array returned by xpath(), not the node itself.
    // Importing the matched node into DOM lets us remove it from the underlying document.
    function fnSimpleXMLDeleteCond()
    {
        $library = new SimpleXMLElement('library.xml', 0, true);
        $book = $library->xpath('/library/book[price>"200" and price<"500"]');

        if($book)
        {
            $node = dom_import_simplexml($book[0]);
            $node->parentNode->removeChild($node);
        }

        header("Content-type: text/xml");
        echo $library->asXML();
    }

// (ii) DOM :

// Delete the book with  200<price<500
    function fnDOMDeleteCond()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;
        $xpath = new DOMXPath($dom);
        $result = $xpath->query('/library/book[price>"200" and price<"500"]');
        $result->item(0)->parentNode->removeChild($result->item(0));
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (9) Rearrange / Reorder Elements

#######################################

// (i) SimpleXML :

 // Exchange Position of 2nd book with 3rd.
// SimpleXML has no built-in function for moving nodes (DOM has one), so we have to write our own. Better to use DOM.
    function fnSimpleXMLRearrange()
    {
         $library = new SimpleXMLElement('library.xml', 0, true);
         //$library->book[3] = $library->book[0]; // this doesn't work

         // Custom function which uses a temporary 3rd node to exchange the data of two nodes.
         fnNodeExchange($library,2,1);
         header("Content-type: text/xml");
         echo $library->asXML();
    }
function fnNodeExchange(&$library,$node1,$node2)
    {
        $cnt = count($library);

        foreach($library->book[$node1]->children() as $book)
         {
            $name = $book->getName();
            $library->book[$cnt]->$name = $book[0];
         }
         foreach($library->book[$node1]->attributes() as $book)
         {
            $name = $book->getName();
            $library->book[$cnt][$name] = $book[0];
         }
         foreach($library->book[$node2]->children() as $book)
         {
            $name = $book->getName();
            $library->book[$node1]->$name = $book[0];
         }
         foreach($library->book[$node2]->attributes() as $book)
         {
            $name = $book->getName();
            $library->book[$node1][$name] = $book[0];
         }
         if($node2!=($cnt-1)){
            foreach($library->book[$cnt]->children() as $book)
            {
               $name = $book->getName();
               $library->book[$node2]->$name = $book[0];
            }
            foreach($library->book[$cnt]->attributes() as $book)
            {
               $name = $book->getName();
               $library->book[$node2][$name] = $book[0];
            }
            unset($library->book[$cnt]);
         }
         else {
            unset($library->book[$cnt-1]);
         }
    }

// (ii) DOM :

// Exchange Position of 2nd book with 3rd.
    function fnDOMRearrange()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;
        // Move the 3rd child in front of the 2nd (indexes assume no whitespace between tags).
        $library->insertBefore($library->childNodes->item(2), $library->childNodes->item(1));
        header("Content-type: text/xml");
        echo $dom->saveXML();
    }

#######################################

// (10) Display Required data in XML Form itself OR Remove all children nodes save one OR Copy/Clone Node Eg OR Compare/Search non numeric data (like date or time) to get result.

#######################################

// (i) SimpleXML :

// Display Books published after 1980 in XML Form itself.
// No function to copy node directly in SimpleXML.
// It's simpler to implement this functionality in DOM.
    function fnSimpleXMLDisplayElementCond()
    {
        $library = new SimpleXMLElement('library.xml', 0, true); // 0 = no options; passing null here is deprecated in PHP 8.1+
        $book = $library->xpath('/library/book[translate(@pubdate,"-","")>translate("1980-01-01","-","")]');
        // Manually create a new structure then add searched data to it (see create xml eg.)
    }

// (ii) DOM :

// Display Books published after 1980 in XML Form itself.
    function fnDOMDisplayElementCond()
    {
        $dom = new DOMDocument();
        $dom->load('library.xml');
        $library = $dom->documentElement;
        $xpath = new DOMXPath($dom);

        // Comparing non numeric standard data
        $result = $xpath->query('/library/book[translate(@pubdate,"-","")>translate("1980-01-01","-","")]');
        // For a simpler search parameter use this:
        //$result = $xpath->query('/library/book[author="J.R.R.Tolkein"]');

        // Copy only node & its attributes not its contents.
        $library = $library->cloneNode(false);
        // Add the 1 element which is search result.
        $library->appendChild($result->item(0));

        header("Content-type: text/xml");
        echo $dom->saveXML($library);
    }

Lessons Learnt:

SimpleXML is fantastic for beginners and for those who will only briefly flirt with XML and perform simple operations on it.
DOM is an absolute necessity for performing complex operations on XML data. Its learning curve is of course steeper than SimpleXML's, but once you get the hang of it, you will find it very logical.
Use XPath for conditional access to data. For serial access (like the last book) XPath is not needed (though you can use it), since normal DOM / SimpleXML node access is enough.

Tuesday, December 21, 2010

Unix Terminal Color Code

Unix Color Codes:

0 Normal text, foreground and background
1 Bold text
4 Underline
5 Blink
7 Inverse

30 Black foreground
31 Red foreground
32 Green foreground
33 Yellow foreground
34 Blue foreground
35 Magenta foreground
36 Cyan foreground
37 White foreground

40 Black background
41 Red background
42 Green background
43 Yellow background
44 Blue background
45 Magenta background
46 Cyan background
47 White background

Syntax: ESC[attribute;foreground;backgroundm (ESC is the escape character, octal 033)

echo -e "\033[1;33;44m Hello, world \033[0m"
printf "\e[1;37;41m Hello, world \e[m\n"

Print all colors:

for j in 0 1 4 5 7
do
    for i in 30 31 32 33 34 35 36 37
    do
        printf '\033[%s;%sm Hello \033[0m\n' "$j" "$i"
    done
    echo
done
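
The attribute, foreground and background codes listed above can be wrapped in a small helper. This is only a sketch: the function name colorize is made up for illustration, and it assumes a terminal that understands these escape sequences.

```shell
# colorize ATTR FG BG TEXT (hypothetical helper): print TEXT wrapped in
# ESC[ATTR;FG;BGm and reset all attributes afterwards with ESC[0m.
colorize() {
    printf '\033[%s;%s;%sm%s\033[0m' "$1" "$2" "$3" "$4"
}

# Bold (1) yellow text (33) on a blue background (44).
colorize 1 33 44 ' Hello, world '
printf '\n'
```

Using printf with the octal escape \033 avoids typing a literal escape character and behaves the same in most shells.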


CR/LF Issues and Text Line-endings - Perforce

SUMMARY

How does Perforce handle CR/LF issues?
How does Perforce translate text file line-endings?

DETAILS

When editing text files in cross-platform environments, you must account for different line termination conventions. Perforce can be configured to automatically translate line-endings from one operating system's convention to another, or configured to ignore line-ending translation. These configurations apply only to text files.

Platform Conventions

The following are the various line termination conventions:

On UNIX, text file line-endings are terminated with a newline character (ASCII 0x0a, represented by the \n escape sequence in most languages), also referred to as a linefeed (LF).
On Windows, line-endings are terminated with a combination of a carriage return (ASCII 0x0d or \r) and a newline(\n), also referred to as CR/LF.
On the Mac Classic (Mac systems using any system prior to Mac OS X), line-endings are terminated with a single carriage return (\r or CR). (Mac OS X uses the UNIX convention.)

The following example files demonstrate the various line-end conventions. They are displayed using the UNIX tool od (octal dump) on a Windows machine:

D:\P4WORK\test>od -c line_end.pc
0000000000   P   C       l   i   n   e       e   n   d  \r  \n
0000000015
D:\P4WORK\test>od -c line_end.mac
0000000000   M   a   c       l   i   n   e       e   n   d  \r
0000000015
D:\P4WORK\test>od -c line_end.unix
0000000000   U   n   i   x       l   i   n   e       e   n   d  \n
0000000016
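
The same od -c check can be scripted. The sketch below is a rough classifier rather than anything Perforce ships: the function name line_ending is invented for this example, and a file with mixed conventions is simply reported as crlf if it contains any CR/LF pair.

```shell
# line_ending FILE (hypothetical helper): print crlf, cr, lf, or none.
line_ending() {
    # Keep only CR and LF bytes, then let od -c spell them out as \r and \n.
    seq=$(tr -cd '\r\n' < "$1" | od -An -v -c | tr -d ' \n')
    case "$seq" in
        '')        echo none ;;   # no line terminators at all
        *'\r\n'*)  echo crlf ;;   # at least one CR immediately followed by LF
        *'\r'*)    echo cr ;;     # CRs, but never followed by LF
        *)         echo lf ;;     # only LFs
    esac
}

sample=$(mktemp)
printf 'PC line end\r\n' > "$sample"
line_ending "$sample"
rm -f "$sample"
```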

Current Versions of Perforce

On the server side, Perforce processes all text files using Unix-style LF line-endings. Although Perforce stores server archive files on disk in the operating system's native line termination convention (CR/LF on Windows, LF on Unix), all line-endings are normalized to Unix-style LF line-endings for internal Perforce Server operations such as p4 sync, p4 submit and p4 diff.

On the client workspace side, Perforce handling of line-endings is determined by a global option for each clientspec. When you sync text files to a client workspace with p4 sync, or submit them back to a Perforce Server with p4 submit, their line-endings are converted as specified in the clientspec LineEnd section.

Beginning with the 2001.1 version of Perforce, there are five possible workspace options for handling text file line-endings. These options for line-end treatment are:

local        Use the mode native to the client (default).
unix         Linefeed: UNIX style.  
mac          Carriage return: Macintosh style.  
win          Carriage return-linefeed: Windows style.  
share        Hybrid: writes UNIX style but reads UNIX, Mac, or Windows style.

The default value for all Perforce client workspaces is local, meaning that files sync to the client workspace using the client platform's standard line-ending characters. So the default LineEnd section of the clientspec would show the following:

LineEnd:    local

On UNIX and Mac OS X client workspaces, the default local setting does not cause any line-end conversion. Perforce client workspaces on UNIX store text files with LF line-endings. Because the Perforce Server uses LF line-endings for operations involving text files, there is no need to do any line-end conversion in this case.

By contrast, syncing files to a Windows or Macintosh workspace requires line-end conversion, because those operating systems' native line termination formats are different from UNIX. In these cases, using the local setting converts LF to CR/LF in the Windows workspace and LF to CR in the Macintosh workspace. When files are submitted back to the Perforce Server, the line-endings are converted back to LF.

The Perforce line-end options can be used to convert your text file line endings regardless of the platform where your client workspace resides. For example, a Mac Classic user can set their client workspace line-end option to win, to sync text files to their workspace and retain Windows-style CR/LF line-endings. UNIX users can create client workspace files with Macintosh CR line termination by choosing the mac line-end option and then syncing files into their workspace.

Using the unix client workspace option on a UNIX or Mac OS X client is equivalent to using the local setting. Similarly, the local setting for a Windows workspace is equivalent to win, and the local setting for a Mac Classic workspace is equivalent to mac. Again, the local setting is equivalent to the operating system's native line termination convention.

You might have files in your workspace that have mixed line termination conventions. For example, you might work in a cross platform environment and use a text editor that can save files with multiple line-ending formats. In this case, it is possible to edit files and inadvertently save them with spurious line-end characters. For example, saving a text file with CRLF line-endings in a unix workspace and then submitting it results in the files being stored in the depot with extra CR characters at the end of each line. When these files are synced to other unix workspaces, they will have CRLF line-endings rather than the correct LF line-endings, and when these files are synced to win workspaces, they will have CRCRLF line-endings (since each LF in the original file is converted to CRLF).
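
The CRCRLF effect is easy to reproduce outside Perforce. The sketch below only imitates the LF to CRLF conversion described above with awk; sync_to_win is a made-up name and not a Perforce command.

```shell
# sync_to_win FILE (hypothetical stand-in for a sync into a "win" workspace):
# every LF-terminated line is written out again with CR/LF.
sync_to_win() {
    awk '{ printf "%s\r\n", $0 }' "$1"
}

depot=$(mktemp)
# A file that was saved with CRLF and submitted unchanged from a unix workspace.
printf 'hello\r\n' > "$depot"
# The stored CR survives and a second one is added: the line ends in \r \r \n.
sync_to_win "$depot" | od -c
rm -f "$depot"
```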

Here, the share option is useful. The share option is used to "normalize" mixed line-endings into UNIX line-end format. The share option does not affect files that are synced into a client workspace. However, when files are submitted back to the Perforce Server, the share option converts all Windows-style CRLF line-endings and all Mac-style CR line-endings to the UNIX-style LF, leaving lone LFs untouched.
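
To get a feel for that normalization, here is a minimal sketch of the same rule in awk: CRLF and lone CR both become LF, and lone LF is left alone. The name normalize_eol is an assumption for illustration, and the sketch always terminates the last line with LF, which may differ from what Perforce does.

```shell
# normalize_eol FILE (hypothetical helper): write FILE to stdout with
# CRLF and lone CR line-endings converted to LF.
normalize_eol() {
    # awk splits records on LF, so a CR at the end of a record came from CRLF.
    # Any CR left inside the record is a lone (Mac-style) line ending.
    awk '{ sub(/\r$/, ""); gsub(/\r/, "\n"); print }' "$1"
}

mixed=$(mktemp)
printf 'one\r\ntwo\rthree\n' > "$mixed"
normalize_eol "$mixed" | od -c
rm -f "$mixed"
```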

For more information on the current LineEnd options see the p4 client section of the command reference.

Previous Versions of Perforce (99.1 to 2000.1)

Perforce clientspecs have a single client workspace option, [no]crlf, that toggles line-ending translation on and off for all files on Windows and Macintosh clients. On UNIX clients, this setting is ignored.

The default value on both Windows and Mac Classic clients is crlf. The crlf option enables line-end translation using the operating system's default line termination convention -- CR for Mac Classic text files, CR/LF for Windows text files.

To override the default CR/LF translation behavior you set the clientspec option to nocrlf. In this case, line-end translation is ignored when files are retrieved from, or submitted to, the Perforce Server. This setting is useful in instances where you want to preserve UNIX-style line-endings in a Windows client workspace. For example, if you were using UNIX shell tools on Windows or mounting NFS drives on a Windows based machine, preserving the UNIX-style line-endings would be preferable. In these cases, your text editor is a factor. Some Windows editors only save files with CR/LF endings, while others can save files in either PC, UNIX or Mac line-end format. As an example, if your client workspace is set to ignore line-end translation (nocrlf), and your text editor saved files in Windows format (CR/LF), then your files will contain extra carriage returns when submitted back to the server. When such files are then synced to a UNIX client workspace, they contain spurious ^M (Control-M) characters at the end of lines. To avoid this, you must save text files in the correct line-end format when using the old nocrlf clientspec option.

An alternative to setting the old nocrlf option is to treat a file as type binary. This preserves whatever line termination style the file is saved with, because line-end translation is ignored for binary files. However, this configuration also disables all other Perforce text-specific features for that file, including RCS reverse-delta storage and three-way merging capability.

Previous Versions of Perforce (98.2 and earlier)

The crlf or local translation option is implicit and cannot be altered.

Note: It is possible to add text files to the Perforce repository as type binary or binary+D files in order to bypass line-ending translation for those files. If you do add text files as Perforce type binary, you will need to use the -t flag when diffing or merging in order to treat the files as text.

Monday, December 6, 2010

Bits to Bytes to Kilobytes to Megabytes to Gigabytes to Terabytes to Petabytes to Exabytes

The basic unit used in computer data storage is called a bit (binary digit). Each bit is either a one or a zero, and computers use these bits to do things and talk to other computers. All your files, for instance, are kept in the computer as binary files and translated into words and pictures by the software (which is also ones and zeros). This two-digit system is called a "binary number system" since it has only two digits in it. The decimal number system, in contrast, has ten unique digits, zero through nine.

But although computer data and file sizes are normally measured using the binary number system (counted in powers of two: 1, 2, 4, 8, 16, 32, 64, etc.), the prefixes for the multiples are based on the metric system! The nearest power of two to 1,000 is 2^10 or 1,024; thus 1,024 bytes was named a Kilobyte. So, although a metric "kilo" equals 1,000 (e.g. one kilogram = 1,000 grams), a binary "Kilo" equals 1,024 (e.g. one Kilobyte = 1,024 bytes). Not surprisingly, this has led to a great deal of confusion.

In December 1998, the International Electrotechnical Commission (IEC) approved a new IEC International Standard. Instead of using the metric prefixes for binary multiples, the new IEC standard introduced specific prefixes made up of the first two letters of the metric prefix followed by the first two letters of the word "binary". Thus, for instance, instead of Kilobyte (KB) or Gigabyte (GB), the new terms would be kibibyte (KiB) or gibibyte (GiB). The new IEC International Standards, which are not commonly used yet, are included below.

Here are a few more details to consider:

Although data storage capacity is generally expressed in binary code, many hard drive manufacturers (and some newer BIOSs) use a decimal system to express capacity.
- For example, a 30 gigabyte drive is usually 30,000,000,000 bytes (decimal) not the 32,212,254,720 binary bytes you would expect.
Another trivial point is that in the metric system the "k" or "kilo" prefix is always lowercase (i.e. kilogram = kg not Kg) but since these binary uses for data storage capacity are not properly metric, it has become standard to use an uppercase "K" for the binary form.
When used to describe Data Transfer Rate, bits/bytes are calculated as in the metric system (in multiples of 1,000).
- Kilobits per second is usually shortened to kbps or Kbps. Although technically speaking, the term kilobit should have a lowercase initial letter, it has become common to capitalize it in abbreviation (e.g. "56 Kbps" or "56K"). The simple "K" might seem ambiguous but, in the context of data transfer, it can be assumed that the measurement is in bits rather than bytes unless indicated otherwise.
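
The 30 gigabyte figures above can be checked with shell arithmetic. This is a small sketch with made-up function names, and it assumes a shell with 64-bit integer arithmetic.

```shell
# gb_to_bytes N (hypothetical helper): bytes in N decimal gigabytes (N * 1000^3).
gb_to_bytes() {
    echo $(( $1 * 1000 * 1000 * 1000 ))
}

# gib_to_bytes N (hypothetical helper): bytes in N binary gigabytes (N * 1024^3).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

echo "30 GB, decimal: $(gb_to_bytes 30) bytes"
echo "30 GB, binary:  $(gib_to_bytes 30) bytes"
```

The difference between the two results is the capacity that seems to go missing when a drive sold in decimal gigabytes is reported in binary units.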

File Storage Capacity by Bits and Bytes
	bit	byte	Kilobyte	Megabyte	Gigabyte
bit	1	8	8,192	8,388,608	8,589,934,592
byte	8	1	1,024	1,048,576	1,073,741,824
Kilobyte	8,192	1,024	1	1,024	1,048,576
Megabyte	8,388,608	1,048,576	1,024	1	1,024
Gigabyte	8,589,934,592	1,073,741,824	1,048,576	1,024	1
Terabyte	8,796,093,022,208	1,099,511,627,776	1,073,741,824	1,048,576	1,024
Petabyte	9,007,199,254,740,992	1,125,899,906,842,624	1,099,511,627,776	1,073,741,824	1,048,576
Exabyte	9,223,372,036,854,775,808	1,152,921,504,606,846,976	1,125,899,906,842,624	1,099,511,627,776	1,073,741,824
Zettabyte	9,444,732,965,739,290,427,392	1,180,591,620,717,411,303,424	1,152,921,504,606,846,976	1,125,899,906,842,624	1,099,511,627,776

File Storage Capacity by Powers of Two (Base 2)
	bit	byte	Kilobyte	Megabyte	Gigabyte	Terabyte	Petabyte	Exabyte	Zettabyte	Yottabyte
bit	2^0	2^3	2^13	2^23	2^33	2^43	2^53	2^63	2^73	2^83
byte	2^3	2^0	2^10	2^20	2^30	2^40	2^50	2^60	2^70	2^80
Kilobyte	2^13	2^10	2^0	2^10	2^20	2^30	2^40	2^50	2^60	2^70
Megabyte	2^23	2^20	2^10	2^0	2^10	2^20	2^30	2^40	2^50	2^60
Gigabyte	2^33	2^30	2^20	2^10	2^0	2^10	2^20	2^30	2^40	2^50
Terabyte	2^43	2^40	2^30	2^20	2^10	2^0	2^10	2^20	2^30	2^40
Petabyte	2^53	2^50	2^40	2^30	2^20	2^10	2^0	2^10	2^20	2^30
Exabyte	2^63	2^60	2^50	2^40	2^30	2^20	2^10	2^0	2^10	2^20
Zettabyte	2^73	2^70	2^60	2^50	2^40	2^30	2^20	2^10	2^0	2^10
Yottabyte	2^83	2^80	2^70	2^60	2^50	2^40	2^30	2^20	2^10	2^0

New IEC Standard
bit	bit	0 or 1
byte	B	8 bits
kibibit	Kibit	1024 bits
kilobit	kbit	1000 bits
kibibyte (binary)	KiB	1024 bytes
kilobyte (decimal)	kB	1000 bytes
megabit	Mbit	1000 kilobits
mebibyte (binary)	MiB	1024 kibibytes
megabyte (decimal)	MB	1000 kilobytes
gigabit	Gbit	1000 megabits
gibibyte (binary)	GiB	1024 mebibytes
gigabyte (decimal)	GB	1000 megabytes
terabit	Tbit	1000 gigabits
tebibyte (binary)	TiB	1024 gibibytes
terabyte (decimal)	TB	1000 gigabytes
petabit	Pbit	1000 terabits
pebibyte (binary)	PiB	1024 tebibytes
petabyte (decimal)	PB	1000 terabytes
exabit	Ebit	1000 petabits
exbibyte (binary)	EiB	1024 pebibytes
exabyte (decimal)	EB	1000 petabytes
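
As a rough illustration of the binary prefixes in the table, the sketch below turns a byte count into the largest whole IEC unit. The function name human_iec is invented here, the result is truncated rather than rounded, and 64-bit shell arithmetic is assumed.

```shell
# human_iec BYTES (hypothetical helper): print BYTES in the largest binary
# unit that leaves at least 1, dividing by 1024 at each step (truncating).
human_iec() {
    n=$1
    # Reuse the positional parameters as the list of remaining unit names.
    set -- B KiB MiB GiB TiB PiB EiB
    while [ "$n" -ge 1024 ] && [ $# -gt 1 ]; do
        n=$(( n / 1024 ))
        shift
    done
    echo "$n $1"
}

human_iec 1024
human_iec 32212254720
```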