Thursday, March 8, 2012

"Random IV"?

I just encrypted a bunch of rows in a table. I was a little surprised
that encrypting N'A' resulted in a 44-byte result. N'AAAA' resulted in
a 52-byte result. That indicates that 8-byte padding is being used.
So the actual encrypted text of N'A' is only 8 bytes in length. What
in the world are the other 36 bytes being used for? I noticed an
interesting 'prefix' for the encrypted data:
0x0074126478AFB34BA54BA6280E2335EB010000
003AE8D337EF2E5B55BABDE251B245C1FCCD
B21C770073B642
0x0074126478AFB34BA54BA6280E2335EB010000
0057252C84D4B6C9F3FFA40444BB7E5ADC50
9233F73548C7A8
0x0074126478AFB34BA54BA6280E2335EB010000
0077113F30B10CC2A8FD976628C98C06C5F7
0E47AEEF3BE86C
It appears the first 20 bytes of every encrypted string are exactly the
same, regardless of the row, column, or data being encrypted. Further
these 20 bytes appear to be generated when the OPEN ... KEY statement
is issued. In between OPEN ... KEY statements, these first 20
characters change, but for multiple encryptions after an OPEN ... KEY
they remain the same.
These first 20 - 36 bytes wouldn't happen to be the "Random IV" would
they? One would think that storing the IV with the data would be
potentially less secure than letting the user manage their own IV...
What do you think?The first 16 prefix bytes represent the key identifier (a GUID). They are
used to determine what key should be used for decryption. They're
particularly useful if you use multiple encryption keys per column. The
decryption builtins will automatically search for the proper key and use it
to decrypt the blob.
So, if you're doing something like
select decryptbykey(SSN) from customers
it will decrpyt all SSNs encrypted by keys that you currently have opened.
The IV is indeed prefixed to the data. IV's are not meant to be secret,
they're just used to randomize the encryption output, so that patterns in
the cleartext data are not disclosed.
Thanks
Laurentiu Cristofor [MSFT]
Software Design Engineer
SQL Server Engine
http://blogs.msdn.com/lcris/
This posting is provided "AS IS" with no warranties, and confers no rights.
<coles.michael@.gmail.com> wrote in message
news:1151728397.102411.65150@.h44g2000cwa.googlegroups.com...
>I just encrypted a bunch of rows in a table. I was a little surprised
> that encrypting N'A' resulted in a 44-byte result. N'AAAA' resulted in
> a 52-byte result. That indicates that 8-byte padding is being used.
> So the actual encrypted text of N'A' is only 8 bytes in length. What
> in the world are the other 36 bytes being used for? I noticed an
> interesting 'prefix' for the encrypted data:
> 0x0074126478AFB34BA54BA6280E2335EB010000
003AE8D337EF2E5B55BABDE251B245C1FC
CDB21C770073B642
> 0x0074126478AFB34BA54BA6280E2335EB010000
0057252C84D4B6C9F3FFA40444BB7E5ADC
509233F73548C7A8
> 0x0074126478AFB34BA54BA6280E2335EB010000
0077113F30B10CC2A8FD976628C98C06C5
F70E47AEEF3BE86C
> It appears the first 20 bytes of every encrypted string are exactly the
> same, regardless of the row, column, or data being encrypted. Further
> these 20 bytes appear to be generated when the OPEN ... KEY statement
> is issued. In between OPEN ... KEY statements, these first 20
> characters change, but for multiple encryptions after an OPEN ... KEY
> they remain the same.
> These first 20 - 36 bytes wouldn't happen to be the "Random IV" would
> they? One would think that storing the IV with the data would be
> potentially less secure than letting the user manage their own IV...
> What do you think?
>|||The first 20 bytes are not part of the ciphertext, it is really some
metadata for internal use:
*The first 16 bytes is the key GUID used by the system to find the proper
key from the key ring (opened keys), if there is no key that matches the
ciphertext GUID, the operation fails immediately returning NULL.
*The next 4 bytes is an internal version number, this value is hardcoded
for now, but can be potentially used in the future to distinguish what
version of the product was used to generate any given ciphertext.
*Then we have the randomly generated IV (8 bytes in case of DES family,
16 in case of AES) followed by the rest of the ciphertext + padding (up to a
whole block of padding, again 8 or 16 bytes depending on the algorithm).
The real plaintext is not purely the plaintext given to the builtin. It
includes an 8 byte header that describes some attributes of the plaintext an
d
the presence of the authenticator parameter for the builtin.
Let’s take a look to a sample:
CREATE SYMMETRIC KEY key1 WITH ALGORITHM = TRIPLE_DES ENCRYPTION BY PASSWORD
= 'k3y_1 P@.zzw0rd'
CREATE SYMMETRIC KEY key2 WITH ALGORITHM = TRIPLE_DES ENCRYPTION BY PASSWORD
= 'k3y_1 P@.zzw0rd'
go
OPEN SYMMETRIC KEY key1 DECRYPTION BY PASSWORD = 'k3y_1 P@.zzw0rd'
OPEN SYMMETRIC KEY key2 DECRYPTION BY PASSWORD = 'k3y_1 P@.zzw0rd'
go
SELECT key_name, convert(varbinary(100), key_guid) as key_guid_bin FROM
sys.openkeys
go
CREATE TABLE key_demo( key_guid_bin varbinary(100) , ciphertext
varbinary(8000) )
go
-- 3 times the same value with each key
INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
key_guid('key1'), 'AAAAAAAA' ))
INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
key_guid('key1'), 'AAAAAAAA' ))
INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
key_guid('key1'), 'AAAAAAAA' ))
INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
key_guid('key2'), 'AAAAAAAA' ))
INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
key_guid('key2'), 'AAAAAAAA' ))
INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
key_guid('key2'), 'AAAAAAAA' ))
go
SELECT * FROM key_demo
go
Let’s take a closer look to these results. In my case, I got the following
GUID on my demo key:
0x008FBF67275CF643BF623BC0422F7F5C
And the first ciphertext:
0x008FBF67275CF643BF623BC0422F7F5C010000
000EC0844ACA2346DB0FE2BD35FF93BEA656
B2A45296CB8E000F7094C2F21D0285
Notice that the GUID is indeed the prefix of the ciphertext, then we have
“01000000”, which is the hardcoded version.
0EC0844ACA2346DB – IV
0FE2BD35FF93BEA6 – 8 byte header
56B2A45296CB8E00 – cipher text (8 bytes)
0F7094C2F21D0285 – padding (8 bytes)
Now, let’s take a quick look and compare the next row:
0x008FBF67275CF643BF623BC0422F7F5C – same GUID (same key was used to encry
pt
this plaintext)
01000000 – same version
E4BA5A5141FE5842 – IV (different, and from this point the rest of the
ciphertext will be different as well)
0E4EEB8440877430 - header – (the plaintext for both headers should be the
same as the same options were used)
33C361724D74742F
26E6829B957E877F
I hope this helped to clarify the usage of the first 20 bytes of data as
well as the other 8 bytes used for the internal use header.
-Raul Garcia
SDE/T
SQL Server Engine
________________________________________
This posting is provided "AS IS" with no warranties, and confers no rights.
"coles.michael@.gmail.com" wrote:

> I just encrypted a bunch of rows in a table. I was a little surprised
> that encrypting N'A' resulted in a 44-byte result. N'AAAA' resulted in
> a 52-byte result. That indicates that 8-byte padding is being used.
> So the actual encrypted text of N'A' is only 8 bytes in length. What
> in the world are the other 36 bytes being used for? I noticed an
> interesting 'prefix' for the encrypted data:
> 0x0074126478AFB34BA54BA6280E2335EB010000
003AE8D337EF2E5B55BABDE251B245C1FC
CDB21C770073B642
> 0x0074126478AFB34BA54BA6280E2335EB010000
0057252C84D4B6C9F3FFA40444BB7E5ADC
509233F73548C7A8
> 0x0074126478AFB34BA54BA6280E2335EB010000
0077113F30B10CC2A8FD976628C98C06C5
F70E47AEEF3BE86C
> It appears the first 20 bytes of every encrypted string are exactly the
> same, regardless of the row, column, or data being encrypted. Further
> these 20 bytes appear to be generated when the OPEN ... KEY statement
> is issued. In between OPEN ... KEY statements, these first 20
> characters change, but for multiple encryptions after an OPEN ... KEY
> they remain the same.
> These first 20 - 36 bytes wouldn't happen to be the "Random IV" would
> they? One would think that storing the IV with the data would be
> potentially less secure than letting the user manage their own IV...
> What do you think?
>|||Thank you both for the info. That is quite a bit of extra storage when
encrypting small strings, but apparently that's just the cost of doing
business. I have another question for you - is it possible to invoke your
own implementation of an algorithm from SS 2005? For instance, if I created
my own CryptoAPI CSP can the T-SQL encryption functions recognize and use
it?
"Raul Garcia [MS]" <Raul Garcia [MS]@.discussions.microsoft.com> wrot
e in
message news:F7E85812-8C39-4E46-B8C0-C9B883BA06EC@.microsoft.com...[vbcol=seagreen]
> The first 20 bytes are not part of the ciphertext, it is really some
> metadata for internal use:
> *The first 16 bytes is the key GUID used by the system to find the
> proper
> key from the key ring (opened keys), if there is no key that matches the
> ciphertext GUID, the operation fails immediately returning NULL.
> *The next 4 bytes is an internal version number, this value is hardcoded
> for now, but can be potentially used in the future to distinguish what
> version of the product was used to generate any given ciphertext.
> *Then we have the randomly generated IV (8 bytes in case of DES family,
> 16 in case of AES) followed by the rest of the ciphertext + padding (up to
> a
> whole block of padding, again 8 or 16 bytes depending on the algorithm).
> The real plaintext is not purely the plaintext given to the builtin. It
> includes an 8 byte header that describes some attributes of the plaintext
> and
> the presence of the authenticator parameter for the builtin.
> Let's take a look to a sample:
> CREATE SYMMETRIC KEY key1 WITH ALGORITHM = TRIPLE_DES ENCRYPTION BY
> PASSWORD
> = 'k3y_1 P@.zzw0rd'
> CREATE SYMMETRIC KEY key2 WITH ALGORITHM = TRIPLE_DES ENCRYPTION BY
> PASSWORD
> = 'k3y_1 P@.zzw0rd'
> go
> OPEN SYMMETRIC KEY key1 DECRYPTION BY PASSWORD = 'k3y_1 P@.zzw0rd'
> OPEN SYMMETRIC KEY key2 DECRYPTION BY PASSWORD = 'k3y_1 P@.zzw0rd'
> go
> SELECT key_name, convert(varbinary(100), key_guid) as key_guid_bin FROM
> sys.openkeys
> go
> CREATE TABLE key_demo( key_guid_bin varbinary(100) , ciphertext
> varbinary(8000) )
> go
> -- 3 times the same value with each key
> INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
> key_guid('key1'), 'AAAAAAAA' ))
> INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
> key_guid('key1'), 'AAAAAAAA' ))
> INSERT INTO key_demo VALUES( key_guid('key1'), EncryptByKey(
> key_guid('key1'), 'AAAAAAAA' ))
> INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
> key_guid('key2'), 'AAAAAAAA' ))
> INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
> key_guid('key2'), 'AAAAAAAA' ))
> INSERT INTO key_demo VALUES( key_guid('key2'), EncryptByKey(
> key_guid('key2'), 'AAAAAAAA' ))
> go
> SELECT * FROM key_demo
> go
> Let's take a closer look to these results. In my case, I got the following
> GUID on my demo key:
> 0x008FBF67275CF643BF623BC0422F7F5C
> And the first ciphertext:
> 0x008FBF67275CF643BF623BC0422F7F5C010000
000EC0844ACA2346DB0FE2BD35FF93BEA6
56B2A45296CB8E000F7094C2F21D0285
> Notice that the GUID is indeed the prefix of the ciphertext, then we have
> "01000000", which is the hardcoded version.
> 0EC0844ACA2346DB - IV
> 0FE2BD35FF93BEA6 - 8 byte header
> 56B2A45296CB8E00 - cipher text (8 bytes)
> 0F7094C2F21D0285 - padding (8 bytes)
> Now, let's take a quick look and compare the next row:
> 0x008FBF67275CF643BF623BC0422F7F5C - same GUID (same key was used to
> encrypt
> this plaintext)
> 01000000 - same version
> E4BA5A5141FE5842 - IV (different, and from this point the rest of the
> ciphertext will be different as well)
> 0E4EEB8440877430 - header - (the plaintext for both headers should be the
> same as the same options were used)
> 33C361724D74742F
> 26E6829B957E877F
> I hope this helped to clarify the usage of the first 20 bytes of data as
> well as the other 8 bytes used for the internal use header.
> -Raul Garcia
> SDE/T
> SQL Server Engine
> ________________________________________
> This posting is provided "AS IS" with no warranties, and confers no
> rights.
>
> "coles.michael@.gmail.com" wrote:
>|||No, right now, you cannot specify the CSP to be used for encryption. We're
looking at adding this capability in future versions of SQL Server.
Thanks
Laurentiu Cristofor [MSFT]
Software Design Engineer
SQL Server Engine
http://blogs.msdn.com/lcris/
This posting is provided "AS IS" with no warranties, and confers no rights.
"Mike C#" <xyz@.xyz.com> wrote in message
news:u43EzXQoGHA.964@.TK2MSFTNGP05.phx.gbl...
> Thank you both for the info. That is quite a bit of extra storage when
> encrypting small strings, but apparently that's just the cost of doing
> business. I have another question for you - is it possible to invoke your
> own implementation of an algorithm from SS 2005? For instance, if I
> created my own CryptoAPI CSP can the T-SQL encryption functions recognize
> and use it?
>
> "Raul Garcia [MS]" <Raul Garcia [MS]@.discussions.microsoft.com> wr
ote in
> message news:F7E85812-8C39-4E46-B8C0-C9B883BA06EC@.microsoft.com...
>

No comments:

Post a Comment