Core Function Pack

Revision as of 21:18, 26 September 2013

Pack( <format>, <args> )

Description

Packs the given arguments into a binary array (binary string) according to format.

Parameters

format

The format string consists of format codes followed by an optional repeater argument. The repeater argument can be either an integer value or * for repeating to the end of the input data.

For a, A, b, B, h and H, the repeat count specifies how many characters of one data argument are taken. For @, it is the absolute position at which to put the next data. For everything else, the repeat count specifies how many data arguments are consumed and packed into the resulting binary string.

You may place spaces in the format string; they are stripped automatically, so you can use them to make the format more readable.

Currently implemented formats are:

Code 	Description
a 	NUL-padded string
A 	SPACE-padded string
b 	A bit string (ascending bit order inside each byte, like the Vec() function)
B 	A bit string (descending bit order inside each byte)
h 	Hex string, low nibble first
H 	Hex string, high nibble first
c	signed ASCII char
C 	unsigned ASCII char
U 	UNICODE char
s 	signed short (always 16 bit, machine byte order)
S 	unsigned short (always 16 bit, machine byte order)
n 	unsigned short (always 16 bit, big endian byte order)
v 	unsigned short (always 16 bit, little endian byte order)
i 	signed integer (machine dependent size and byte order)
I 	unsigned integer (machine dependent size and byte order)
l 	signed long (always 32 bit, machine byte order)
L 	unsigned long (always 32 bit, machine byte order)
q 	signed quad (64-bit) value (always 64 bit, machine byte order)
Q 	unsigned quad (64-bit) value (always 64 bit, machine byte order)
N 	unsigned long (always 32 bit, big endian byte order)
V 	unsigned long (always 32 bit, little endian byte order)
f 	float (machine dependent size and representation)
d 	double (machine dependent size and representation)
x 	NUL byte
X 	Back up one byte
@ 	NUL-fill to absolute position
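
As a rough cross-check of the byte-order codes in the table, the sketch below uses Python's struct module rather than Sputnik. The mapping in the comments (for example ">H" standing in for "n") is an assumption made for illustration; it only shows what big endian and little endian layouts look like for 16-bit, 32-bit and 64-bit values.

```python
import struct

# ">" means big endian and "<" means little endian in Python's struct module.
# "H" is a 16-bit unsigned value, "I" is 32-bit unsigned, "q" is 64-bit signed.
big_short = struct.pack(">H", 0x1234)        # comparable to "n"
little_short = struct.pack("<H", 0x1234)     # comparable to "v"
big_long = struct.pack(">I", 0x12345678)     # comparable to "N"
little_long = struct.pack("<I", 0x12345678)  # comparable to "V"
quad = struct.pack("<q", -1)                 # 64-bit signed; byte order fixed for the demo

print(list(big_short))     # [18, 52]
print(list(little_short))  # [52, 18]
print(list(big_long))      # [18, 52, 86, 120]
print(list(little_long))   # [120, 86, 52, 18]
print(len(quad))           # 8
```

The codes described as "machine byte order" have no fixed layout, so they are left out of this sketch.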

args

One or more variables to be used in the packing.

Return Value

Success: Returns the new binary variable.

Failure: Returns null.

Remarks

Note that Sputnik stores variables as signed or unsigned as needed, which can be confusing if you are a PHP user using this Pack function. When a format code says unsigned, it is meant literally: the value is treated as unsigned (or signed) exactly as specified, both when packing and when unpacking.

This also means that if the original value was an Int64 and you pack it with "i" and then unpack it with "i", the new value will be an Int32 and not an Int64. Sputnik sees no need to unpack everything to the largest data type, so "f" gives you a Float and not a Double, and so on. That does not stop you from casting the value to a Double when you take it from the returned array.
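
The effect of unpacking to the narrower type can be illustrated outside Sputnik. The following is a minimal sketch in Python's struct module, assuming 4-byte floats and 8-byte doubles; it shows that a value sent through a float-sized field comes back slightly changed, while a double-sized field returns it exactly. How Sputnik itself handles a value that is too large for "i" is not described here; Python simply refuses it.

```python
import struct

value = 777.42

# Round-trip through a 4-byte float and through an 8-byte double.
as_float = struct.unpack("<f", struct.pack("<f", value))[0]
as_double = struct.unpack("<d", struct.pack("<d", value))[0]

print(as_double == value)  # True: a double round-trips exactly
print(as_float == value)   # False: the float lost precision

# A value that needs more than 32 bits does not fit a 32-bit field.
try:
    struct.pack("<i", 2**40)
    fits_in_int32 = True
except struct.error:
    fits_in_int32 = False
print(fits_in_int32)       # False
```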

Be aware that if you do not name an element, an empty string is used as its key. If you leave more than one element unnamed, some data is overwritten because the keys are the same.

Warning: when packing multiple pieces of data, * only means "consume all of the current piece of data". That is to say,

print pack("A*A*", $one, $two);

packs all of $one into the first A* and then all of $two into the second. This is a general principle: each format character corresponds to one piece of data to be packed.

Example

Note about the format A and a

# Pack() writes what the string contains
# once it has been converted to bytes.
# If the string contains only ASCII, it
# writes ASCII bytes. However, if the string
# contains even ONE Unicode character,
# it writes Unicode bytes. For example:
 
my $vec = pack("A*", "Hello");
printr $vec;
# Prints
# {BINARY:5}
# {
#         [0] => 72
#         [1] => 101
#         [2] => 108
#         [3] => 108
#         [4] => 111
# }
# Just ASCII as expected
 
# Let's add just one Unicode character
# and try again
my $vec = pack("A*", "Helloキ");
printr $vec;
# {BINARY:8}
# {
#         [0] => 72
#         [1] => 101
#         [2] => 108
#         [3] => 108
#         [4] => 111
#         [5] => 227
#         [6] => 130
#         [7] => 173
# }
# VERY different output, and if you tried
# to read this as ASCII it would not go well.
 
# This is because a string is normally made of
# bytes where every second byte is 0x00 if the
# character is ASCII.
 
# If one of the string's second bytes is not 0x00,
# the string is considered to be Unicode and it
# is printed as such.
 
# There is no need to worry about functions accidentally
# changing a string to Unicode when reading or writing binary
# in the string, since binary access can only ever modify the
# first byte (range 0-255) and NOT the second byte.
 
# So even a large binary string into which you have placed a
# great many bytes is considered ASCII, since every second
# byte of the string is 0x00. You never see that byte unless
# you write a number higher than 255 into one of the
# string's [position]s.
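
The three extra bytes in the second dump (227, 130, 173) match the UTF-8 encoding of キ. The short Python check below is not Sputnik code; it only reproduces the two byte lists, on the assumption that the "Unicode bytes" written by Pack() are UTF-8.

```python
# Encode both strings as UTF-8 and list the resulting byte values.
ascii_bytes = list("Hello".encode("utf-8"))
mixed_bytes = list("Helloキ".encode("utf-8"))

print(ascii_bytes)  # [72, 101, 108, 108, 111]
print(mixed_bytes)  # [72, 101, 108, 108, 111, 227, 130, 173]
```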

Reading and writing ASCII

my $vec = pack("A*", "Hello");
printr $vec;
printr unpack("A*", $vec, 3);
# Prints
# {BINARY:5}
# {
#         [0] => 72
#         [1] => 101
#         [2] => 108
#         [3] => 108
#         [4] => 111
# }
# Hello

Reading and writing ASCII in pieces

my $vec = pack("A*", "Hello");
my $chars = unpack("A/A/A2", $vec, 3);
print( "First letter: " . $chars[0] . "\n" );
print( "Second letter: " . $chars[1] . "\n" );
print( "Third and fourth letter: " . $chars[2] . "\n" );
# Prints
# First letter: H
# Second letter: e
# Third and fourth letter: ll

Reading and writing UNICODE

my $vec = pack("A*", "ハローキティ");
printr $vec;
MsgBox unpack("A*", $vec, 3);
# MsgBox content
# ハローキティ

Reading and writing UNICODE in pieces

# Because Unicode characters use more than ONE byte each,
# we have to account for the difference in the repeater:
# each character in this string takes 3 bytes, so one
# character is A3 and two characters are A6.
my $vec = pack("A*", "ハローキティ");
my $chars = unpack("A3/A3/A6", $vec, 3);
my $str = 
"First letter: " . $chars[0] . @CRLF .
"Second letter: " . $chars[1] . @CRLF .
"Third and fourth letter: " . $chars[2] . @CRLF ;
MsgBox($str);
# MsgBox content
# First letter: ハ
# Second letter: ロ
# Third and fourth letter: ーキ
 
# Basically, this shows that you should KNOW what data you are handling;
# you can't expect Sputnik to just assume the data type.
# Sure, Sputnik could guess the data type and would probably be accurate
# most of the time, but the data you send into Unpack() can be anything,
# so there is no guarantee. Therefore you must explicitly request the type
# you know it to be.
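
The repeaters A3 and A6 above follow from the byte length of each character. As a rough illustration in Python (not Sputnik), and assuming the packed bytes are UTF-8, each character of this string occupies 3 bytes, so slicing the byte string at multiples of 3 recovers whole characters.

```python
data = "ハローキティ".encode("utf-8")

# Every character in this particular string is 3 bytes long in UTF-8.
print(len(data))  # 18

first = data[0:3].decode("utf-8")             # like A3
second = data[3:6].decode("utf-8")            # like A3
third_and_fourth = data[6:12].decode("utf-8")  # like A6

print(first)             # ハ
print(second)            # ロ
print(third_and_fourth)  # ーキ
```

A slice that does not fall on a character boundary, such as data[0:2], cannot be decoded, which is why the repeater has to match the real byte width.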

Using Pack() to format ASCII text in a readable way with everything correctly spaced out

// Make an array of people
my $People = array(array("Name", "ID", "Birth", "Sex", "Race"));
push($People, array("Mike", 0, "1980", "M", "German"));
push($People, array("Tommy", 1, "1985", "M", "French"));
push($People, array("David", 2, "1950", "M", "Jewish"));
push($People, array("Sukara", 3, "1990", "F", "Japanese"));
push($People, array("Mary-Kate", 4, "1987", "F", "American"));
 
// Print them all in a nice formatted way
foreach($People as $Person)
{
	my List ( $Name, $ID, $Birth, $Sex, $Race ) = $Person;
	my $Sorted = pack("A12 A4 A7 A5 A*", $Name, $ID, $Birth, $Sex, $Race);
	echo "$Sorted\n";
}
# Prints
# Name        ID  Birth  Sex  Race
# Mike        0   1980   M    German
# Tommy       1   1985   M    French
# David       2   1950   M    Jewish
# Sukara      3   1990   F    Japanese
# Mary-Kate   4   1987   F    American

Using Unpack() to read from formatted ASCII text and extract information from it similar to Regex but much more reliable for this specific task

# Let's imagine this variable was loaded from a file.
# It contains information that uses specific spacing
# to keep it organized so it can be displayed on the
# company computers properly.
# We are going to read this information and process it.
# You could use regexp or splitting, but they would be
# impractical, so let's use Unpack() instead.
my $data = 
"
Date      |Description                | Income|Expenditure
01/24/2001 Zed's Camel Emporium                    1147.99
01/28/2001 Flea spray                              24.99
01/29/2001 Camel rides to tourists      235.00
01/29/2001 Tourist camel feedings       100.10
";
 
my $TotalIncome = 0.0;
my $TotalExpenditure = 0.0;
foreach(Lines($data) as $line)
{
	if(isEmptyOrNull(trim($line)))
		continue; # Skip useless entries
	my List( $date, $desc, $income, $expend  ) = unpack("A10/x/A27/x/A7/A*", $line);
	$income = trim($income); # Trim it so we can do arithmetic on it
	$expend = trim($expend); # Trim it so we can do arithmetic on it
	$TotalIncome += $income;
	$TotalExpenditure += $expend;
	# Print it just to prove we are reading it correctly; other than that
	# it is pointless
	echo "Log: '$date' | '$desc' | '$income' | '$expend'\n";
}
echo "Total income is: $TotalIncome\n";
echo "Total expenditure is: $TotalExpenditure\n";
# Prints:
# Log: 'Date' | 'Description' | 'Income' | '|Expenditure'
# Log: '01/24/2001' | 'Zed's Camel Emporium' | '' | '1147.99'
# Log: '01/28/2001' | 'Flea spray' | '' | '24.99'
# Log: '01/29/2001' | 'Camel rides to tourists' | '235.00' | ''
# Log: '01/29/2001' | 'Tourist camel feedings' | '100.10' | ''
# Total income is: 335.1
# Total expenditure is: 1172.98
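
The format "A10/x/A27/x/A7/A*" amounts to fixed column offsets. The sketch below shows the same slicing in Python for comparison; parse_fixed_width is a hypothetical helper written for this illustration, and the offsets are derived from the widths 10, 27 and 7 with one skipped byte between the first three fields.

```python
def parse_fixed_width(line):
    """Split one ledger line using the column widths 10, 27 and 7."""
    date = line[0:10].rstrip()     # A10
    desc = line[11:38].rstrip()    # skip 1 byte (x), then A27
    income = line[39:46].strip()   # skip 1 byte (x), then A7
    expend = line[46:].strip()     # A*: everything that is left
    return date, desc, income, expend

# Build a sample line with explicit padding so the columns are exact.
sample = ("01/29/2001" + " "
          + "Camel rides to tourists".ljust(27) + " "
          + "235.00".rjust(7))
fields = parse_fixed_width(sample)
print(fields)  # ('01/29/2001', 'Camel rides to tourists', '235.00', '')
```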

Example of choosing names for KEY in the return array

$binarydata = "\x04\x00\xa0\x00";
$array = Unpack("cchars/nint", $binarydata);
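// The resulting array will contain the entries "chars" and "int".
// "chars" is 4. "int" prints as 194 rather than 160, which is
// consistent with "\xa0" being stored as the two bytes 0xC2 0xA0.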
printr $array;
// Prints:
// ARRAY
// {
//         [chars] => 4
//         [int] => 194
// }

Same as above, using a repeater with a named element

$binarydata = "\x04\x00\xa0\x00";
$array = Unpack("c2chars/nint", $binarydata);
// The resulting array will contain the entries
// "chars1", "chars2" and "int". 
printr $array;
// Prints:
// ARRAY
// {
//         [chars1] => 4
//         [chars2] => 0
//         [int] => 49824
// }
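
The printed values 194 and 49824 in the two examples above can be reproduced if the "\xa0" character is assumed to be stored as the UTF-8 byte pair 0xC2 0xA0. That is an assumption based on the outputs shown, not a documented rule. The Python sketch below (struct module, big endian "H" standing in for "n", signed "b" standing in for "c") decodes both layouts for comparison.

```python
import struct

# Assumed in-memory bytes if "\xa0" is UTF-8 encoded: 04 00 C2 A0 00
encoded = "\x04\x00\xa0\x00".encode("utf-8")
print(list(encoded))  # [4, 0, 194, 160, 0]

# One signed char followed by a big endian unsigned short.
one_char = struct.unpack_from(">bH", encoded, 0)
print(one_char)       # (4, 194)

# Two signed chars followed by a big endian unsigned short.
two_chars = struct.unpack_from(">bbH", encoded, 0)
print(two_chars)      # (4, 0, 49824)

# For contrast: the same decode on four raw bytes with no re-encoding.
raw = b"\x04\x00\xa0\x00"
print(struct.unpack_from(">bH", raw, 0))  # (4, 160)
```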

Packing a bunch of values at once

$binarydata = Pack("nvc*", 0x1234, 0x5678, 65, 66);
// The resulting binary string will be 6 bytes long
// and contain the byte sequence 0x12, 0x34, 0x78, 0x56, 0x41, 0x42. 
printr $binarydata;
// Prints:
// {BINARY:6}
// {
//         [0] => 18
//         [1] => 52
//         [2] => 120
//         [3] => 86
//         [4] => 65
//         [5] => 66
// }
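
The byte sequence listed above can be double-checked with Python's struct module. This is only a cross-check in another language, with ">H", "<H" and "b" assumed to correspond to "n", "v" and "c".

```python
import struct

packed = (
    struct.pack(">H", 0x1234)     # big endian short: 0x12, 0x34
    + struct.pack("<H", 0x5678)   # little endian short: 0x78, 0x56
    + struct.pack("2b", 65, 66)   # two signed chars: 0x41, 0x42
)

print(len(packed))   # 6
print(list(packed))  # [18, 52, 120, 86, 65, 66]
```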

Using * to pack all the remaining objects with the same specifier

printr pack("C*",80,72,80);

String to Hex and back again

function H2Str( $hex ) 
{
	return pack('H*', $hex);
}
 
function Str2H( $str )
{
	return unpack('H*', $str, true);
}
$txt = 'This is test';
$hex = Str2H( $txt );
$str = H2Str( $hex );
echo "${txt} => ${hex} => ${str}\n";
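
For comparison, the same round trip can be sketched in Python. The function names str_to_hex, hex_to_str and swap_nibbles are hypothetical helpers for this illustration. The hex digits are produced high nibble first, as with "H"; swap_nibbles shows what a low-nibble-first ordering, as described for "h", would look like.

```python
def str_to_hex(text):
    """Hex string of the UTF-8 bytes, high nibble first."""
    return text.encode("utf-8").hex()

def hex_to_str(hex_text):
    """Inverse of str_to_hex."""
    return bytes.fromhex(hex_text).decode("utf-8")

def swap_nibbles(hex_text):
    """Reorder each pair of hex digits so the low nibble comes first."""
    return "".join(hex_text[i + 1] + hex_text[i]
                   for i in range(0, len(hex_text), 2))

txt = "This is test"
hex_form = str_to_hex(txt)
print(hex_form)              # 546869732069732074657374
print(hex_to_str(hex_form))  # This is test
print(swap_nibbles("54"))    # 45
```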

Display the ASCII character codes for an entire string

echo join (unpack('C*', 'abcdef'), ' ');
// 97 98 99 100 101 102

Display the UNICODE character codes for an entire string

echo join (unpack('U*', 'こんにちは'), ' ');
// 33251 58259 37762 33251 58283 41345 33251

Convert a string into a binary array and back again:

$arr = BinaryFromStr("Hello World!");
foreach ($arr as $i)
{
	println($i);
}
$str = Unpack("A*", $arr, true);
println($str);

Convert a double into a binary array and back again:

$arr = Pack("d", 777.42);
foreach ($arr as $i)
{
	println($i);
}
$str = Unpack("d", $arr, 3);
println($str);

Convert an int into a binary array and back again:

$arr = Pack("i", (int)777);
foreach ($arr as $i)
{
	println($i);
}
$str = Unpack("i", $arr, 3);
printr($str);

Convert a string into a hex and back again:

$str = "Hello World!";
println("Original String: " . $str);
$hex = Unpack("H*", $str, true);
println("Hex String: " . $hex);
$strf = Pack("H*", $hex);
println("Normal String: " . $strf);

Convert a string into a hex and back again (using a bit of Regexp too)

Function StrToHex ( $Str )  
{
	$Str = unpack('H*', $Str, 3);
	# Optionally convert the hex characters
	# to upper case
	$Str =~ tr/a-z/A-Z/;
	return $Str;
}
Function HexToStr ( $Hex )
{
	$Hex =~ s/([\dA-Fa-f][\dA-Fa-f])/pack("C", dec($1))/eg;
	return $Hex;
}
my $Hex = StrToHex("Hello world!");
echo "Hex: $Hex\n";
my $Str = HexToStr($Hex);
echo "Str: $Str\n";