Monday, December 01, 2008

Version of System.Web.HttpUtility for .NET Client Profile

The .NET Framework Client Profile released with .NET 3.5 SP1 defines a stripped-down version of the .NET Framework for distribution with rich-client applications. This is a valuable feature for developers who need to keep their software distributions small and convenient to install. (For more information, see my previous post comparing bootstrap install times for various .NET framework versions.)

The Client Profile originally included the System.Web assembly, but this was removed from the final service pack release. This is unfortunate because System.Web includes several utility operations that are useful in rich-client applications. One of these is URL encoding and decoding. It would be easy enough to code up a replacement function, but why bother? Instead I've decompiled the latest version of System.Web.HttpUtility (v2.0.50727) and stripped out the members that depend on other System.Web types. All of the URL encoding and decoding methods are intact. The code is below--use at your own risk.

   using System.Text;

   namespace System.Web
   {
       public sealed class HttpUtility
       {
           private static int HexToInt(char h)
           {
               if ((h >= '0') && (h <= '9'))
               {
                   return (h - '0');
               }
               if ((h >= 'a') && (h <= 'f'))
               {
                   return ((h - 'a') + 10);
               }
               if ((h >= 'A') && (h <= 'F'))
               {
                   return ((h - 'A') + 10);
               }
               return -1;
           }

           internal static char IntToHex(int n)
           {
               if (n <= 9)
               {
                   return (char) (n + 0x30);
               }
               return (char) ((n - 10) + 0x61);
           }

           private static bool IsNonAsciiByte(byte b)
           {
               if (b < 0x7f)
               {
                   return (b < 0x20);
               }
               return true;
           }

           internal static bool IsSafe(char ch)
           {
               if ((((ch >= 'a') && (ch <= 'z')) || ((ch >= 'A') && (ch <= 'Z'))) || ((ch >= '0') && (ch <= '9')))
               {
                   return true;
               }
               switch (ch)
               {
                   case '\'':
                   case '(':
                   case ')':
                   case '*':
                   case '-':
                   case '.':
                   case '_':
                   case '!':
                       return true;
               }
               return false;
           }

           public static string UrlDecode(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlDecode(str, Encoding.UTF8);
           }

           public static string UrlDecode(byte[] bytes, Encoding e)
           {
               if (bytes == null)
               {
                   return null;
               }
               return UrlDecode(bytes, 0, bytes.Length, e);
           }

           public static string UrlDecode(string str, Encoding e)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlDecodeStringFromStringInternal(str, e);
           }

           public static string UrlDecode(byte[] bytes, int offset, int count, Encoding e)
           {
               if ((bytes == null) && (count == 0))
               {
                   return null;
               }
               if (bytes == null)
               {
                   throw new ArgumentNullException("bytes");
               }
               if ((offset < 0) || (offset > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("offset");
               }
               if ((count < 0) || ((offset + count) > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("count");
               }
               return UrlDecodeStringFromBytesInternal(bytes, offset, count, e);
           }

           private static byte[] UrlDecodeBytesFromBytesInternal(byte[] buf, int offset, int count)
           {
               int length = 0;
               byte[] sourceArray = new byte[count];
               for (int i = 0; i < count; i++)
               {
                   int index = offset + i;
                   byte num4 = buf[index];
                   if (num4 == 0x2b)
                   {
                       num4 = 0x20;
                   }
                   else if ((num4 == 0x25) && (i < (count - 2)))
                   {
                       int num5 = HexToInt((char) buf[index + 1]);
                       int num6 = HexToInt((char) buf[index + 2]);
                       if ((num5 >= 0) && (num6 >= 0))
                       {
                           num4 = (byte) ((num5 << 4) | num6);
                           i += 2;
                       }
                   }
                   sourceArray[length++] = num4;
               }
               if (length < sourceArray.Length)
               {
                   byte[] destinationArray = new byte[length];
                   Array.Copy(sourceArray, destinationArray, length);
                   sourceArray = destinationArray;
               }
               return sourceArray;
           }

           private static string UrlDecodeStringFromBytesInternal(byte[] buf, int offset, int count, Encoding e)
           {
               UrlDecoder decoder = new UrlDecoder(count, e);
               for (int i = 0; i < count; i++)
               {
                   int index = offset + i;
                   byte b = buf[index];
                   if (b == 0x2b)
                   {
                       b = 0x20;
                   }
                   else if ((b == 0x25) && (i < (count - 2)))
                   {
                       if ((buf[index + 1] == 0x75) && (i < (count - 5)))
                       {
                           int num4 = HexToInt((char) buf[index + 2]);
                           int num5 = HexToInt((char) buf[index + 3]);
                           int num6 = HexToInt((char) buf[index + 4]);
                           int num7 = HexToInt((char) buf[index + 5]);
                           if (((num4 < 0) || (num5 < 0)) || ((num6 < 0) || (num7 < 0)))
                           {
                               goto Label_00DA;
                           }
                           char ch = (char) ((((num4 << 12) | (num5 << 8)) | (num6 << 4)) | num7);
                           i += 5;
                           decoder.AddChar(ch);
                           continue;
                       }
                       int num8 = HexToInt((char) buf[index + 1]);
                       int num9 = HexToInt((char) buf[index + 2]);
                       if ((num8 >= 0) && (num9 >= 0))
                       {
                           b = (byte) ((num8 << 4) | num9);
                           i += 2;
                       }
                   }
                   Label_00DA:
                   decoder.AddByte(b);
               }
               return decoder.GetString();
           }

           private static string UrlDecodeStringFromStringInternal(string s, Encoding e)
           {
               int length = s.Length;
               UrlDecoder decoder = new UrlDecoder(length, e);
               for (int i = 0; i < length; i++)
               {
                   char ch = s[i];
                   if (ch == '+')
                   {
                       ch = ' ';
                   }
                   else if ((ch == '%') && (i < (length - 2)))
                   {
                       if ((s[i + 1] == 'u') && (i < (length - 5)))
                       {
                           int num3 = HexToInt(s[i + 2]);
                           int num4 = HexToInt(s[i + 3]);
                           int num5 = HexToInt(s[i + 4]);
                           int num6 = HexToInt(s[i + 5]);
                           if (((num3 < 0) || (num4 < 0)) || ((num5 < 0) || (num6 < 0)))
                           {
                               goto Label_0106;
                           }
                           ch = (char) ((((num3 << 12) | (num4 << 8)) | (num5 << 4)) | num6);
                           i += 5;
                           decoder.AddChar(ch);
                           continue;
                       }
                       int num7 = HexToInt(s[i + 1]);
                       int num8 = HexToInt(s[i + 2]);
                       if ((num7 >= 0) && (num8 >= 0))
                       {
                           byte b = (byte) ((num7 << 4) | num8);
                           i += 2;
                           decoder.AddByte(b);
                           continue;
                       }
                   }
                   Label_0106:
                   if ((ch & 0xff80) == 0)
                   {
                       decoder.AddByte((byte) ch);
                   }
                   else
                   {
                       decoder.AddChar(ch);
                   }
               }
               return decoder.GetString();
           }

           public static byte[] UrlDecodeToBytes(byte[] bytes)
           {
               if (bytes == null)
               {
                   return null;
               }
               return UrlDecodeToBytes(bytes, 0, (bytes != null) ? bytes.Length : 0);
           }

           public static byte[] UrlDecodeToBytes(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlDecodeToBytes(str, Encoding.UTF8);
           }

           public static byte[] UrlDecodeToBytes(string str, Encoding e)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlDecodeToBytes(e.GetBytes(str));
           }

           public static byte[] UrlDecodeToBytes(byte[] bytes, int offset, int count)
           {
               if ((bytes == null) && (count == 0))
               {
                   return null;
               }
               if (bytes == null)
               {
                   throw new ArgumentNullException("bytes");
               }
               if ((offset < 0) || (offset > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("offset");
               }
               if ((count < 0) || ((offset + count) > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("count");
               }
               return UrlDecodeBytesFromBytesInternal(bytes, offset, count);
           }

           public static string UrlEncode(byte[] bytes)
           {
               if (bytes == null)
               {
                   return null;
               }
               return Encoding.ASCII.GetString(UrlEncodeToBytes(bytes));
           }

           public static string UrlEncode(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlEncode(str, Encoding.UTF8);
           }

           public static string UrlEncode(string str, Encoding e)
           {
               if (str == null)
               {
                   return null;
               }
               return Encoding.ASCII.GetString(UrlEncodeToBytes(str, e));
           }

           public static string UrlEncode(byte[] bytes, int offset, int count)
           {
               if (bytes == null)
               {
                   return null;
               }
               return Encoding.ASCII.GetString(UrlEncodeToBytes(bytes, offset, count));
           }

           private static byte[] UrlEncodeBytesToBytesInternal(byte[] bytes, int offset, int count,
                                                               bool alwaysCreateReturnValue)
           {
               int num = 0;
               int num2 = 0;
               for (int i = 0; i < count; i++)
               {
                   char ch = (char) bytes[offset + i];
                   if (ch == ' ')
                   {
                       num++;
                   }
                   else if (!IsSafe(ch))
                   {
                       num2++;
                   }
               }
               if ((!alwaysCreateReturnValue && (num == 0)) && (num2 == 0))
               {
                   return bytes;
               }
               byte[] buffer = new byte[count + (num2*2)];
               int num4 = 0;
               for (int j = 0; j < count; j++)
               {
                   byte num6 = bytes[offset + j];
                   char ch2 = (char) num6;
                   if (IsSafe(ch2))
                   {
                       buffer[num4++] = num6;
                   }
                   else if (ch2 == ' ')
                   {
                       buffer[num4++] = 0x2b;
                   }
                   else
                   {
                       buffer[num4++] = 0x25;
                       buffer[num4++] = (byte) IntToHex((num6 >> 4) & 15);
                       buffer[num4++] = (byte) IntToHex(num6 & 15);
                   }
               }
               return buffer;
           }

           private static byte[] UrlEncodeBytesToBytesInternalNonAscii(byte[] bytes, int offset, int count,
                                                                       bool alwaysCreateReturnValue)
           {
               int num = 0;
               for (int i = 0; i < count; i++)
               {
                   if (IsNonAsciiByte(bytes[offset + i]))
                   {
                       num++;
                   }
               }
               if (!alwaysCreateReturnValue && (num == 0))
               {
                   return bytes;
               }
               byte[] buffer = new byte[count + (num*2)];
               int num3 = 0;
               for (int j = 0; j < count; j++)
               {
                   byte b = bytes[offset + j];
                   if (IsNonAsciiByte(b))
                   {
                       buffer[num3++] = 0x25;
                       buffer[num3++] = (byte) IntToHex((b >> 4) & 15);
                       buffer[num3++] = (byte) IntToHex(b & 15);
                   }
                   else
                   {
                       buffer[num3++] = b;
                   }
               }
               return buffer;
           }

           internal static string UrlEncodeNonAscii(string str, Encoding e)
           {
               if (string.IsNullOrEmpty(str))
               {
                   return str;
               }
               if (e == null)
               {
                   e = Encoding.UTF8;
               }
               byte[] bytes = e.GetBytes(str);
               bytes = UrlEncodeBytesToBytesInternalNonAscii(bytes, 0, bytes.Length, false);
               return Encoding.ASCII.GetString(bytes);
           }

           internal static string UrlEncodeSpaces(string str)
           {
               if ((str != null) && (str.IndexOf(' ') >= 0))
               {
                   str = str.Replace(" ", "%20");
               }
               return str;
           }

           public static byte[] UrlEncodeToBytes(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlEncodeToBytes(str, Encoding.UTF8);
           }

           public static byte[] UrlEncodeToBytes(byte[] bytes)
           {
               if (bytes == null)
               {
                   return null;
               }
               return UrlEncodeToBytes(bytes, 0, bytes.Length);
           }

           public static byte[] UrlEncodeToBytes(string str, Encoding e)
           {
               if (str == null)
               {
                   return null;
               }
               byte[] bytes = e.GetBytes(str);
               return UrlEncodeBytesToBytesInternal(bytes, 0, bytes.Length, false);
           }

           public static byte[] UrlEncodeToBytes(byte[] bytes, int offset, int count)
           {
               if ((bytes == null) && (count == 0))
               {
                   return null;
               }
               if (bytes == null)
               {
                   throw new ArgumentNullException("bytes");
               }
               if ((offset < 0) || (offset > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("offset");
               }
               if ((count < 0) || ((offset + count) > bytes.Length))
               {
                   throw new ArgumentOutOfRangeException("count");
               }
               return UrlEncodeBytesToBytesInternal(bytes, offset, count, true);
           }

           public static string UrlEncodeUnicode(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return UrlEncodeUnicodeStringToStringInternal(str, false);
           }

           private static string UrlEncodeUnicodeStringToStringInternal(string s, bool ignoreAscii)
           {
               int length = s.Length;
               StringBuilder builder = new StringBuilder(length);
               for (int i = 0; i < length; i++)
               {
                   char ch = s[i];
                   if ((ch & 0xff80) == 0)
                   {
                       if (ignoreAscii || IsSafe(ch))
                       {
                           builder.Append(ch);
                       }
                       else if (ch == ' ')
                       {
                           builder.Append('+');
                       }
                       else
                       {
                           builder.Append('%');
                           builder.Append(IntToHex((ch >> 4) & '\x000f'));
                           builder.Append(IntToHex(ch & '\x000f'));
                       }
                   }
                   else
                   {
                       builder.Append("%u");
                       builder.Append(IntToHex((ch >> 12) & '\x000f'));
                       builder.Append(IntToHex((ch >> 8) & '\x000f'));
                       builder.Append(IntToHex((ch >> 4) & '\x000f'));
                       builder.Append(IntToHex(ch & '\x000f'));
                   }
               }
               return builder.ToString();
           }

           public static byte[] UrlEncodeUnicodeToBytes(string str)
           {
               if (str == null)
               {
                   return null;
               }
               return Encoding.ASCII.GetBytes(UrlEncodeUnicode(str));
           }

           public static string UrlPathEncode(string str)
           {
               if (str == null)
               {
                   return null;
               }
               int index = str.IndexOf('?');
               if (index >= 0)
               {
                   return (UrlPathEncode(str.Substring(0, index)) + str.Substring(index));
               }
               return UrlEncodeSpaces(UrlEncodeNonAscii(str, Encoding.UTF8));
           }

           // Nested Types
           private class UrlDecoder
           {
               // Fields
               private int _bufferSize;
               private byte[] _byteBuffer;
               private char[] _charBuffer;
               private Encoding _encoding;
               private int _numBytes;
               private int _numChars;

               // Methods
               internal UrlDecoder(int bufferSize, Encoding encoding)
               {
                   _bufferSize = bufferSize;
                   _encoding = encoding;
                   _charBuffer = new char[bufferSize];
               }

               internal void AddByte(byte b)
               {
                   if (_byteBuffer == null)
                   {
                       _byteBuffer = new byte[_bufferSize];
                   }
                   _byteBuffer[_numBytes++] = b;
               }

               internal void AddChar(char ch)
               {
                   if (_numBytes > 0)
                   {
                       FlushBytes();
                   }
                   _charBuffer[_numChars++] = ch;
               }

               private void FlushBytes()
               {
                   if (_numBytes > 0)
                   {
                       _numChars += _encoding.GetChars(_byteBuffer, 0, _numBytes, _charBuffer, _numChars);
                       _numBytes = 0;
                   }
               }

               internal string GetString()
               {
                   if (_numBytes > 0)
                   {
                       FlushBytes();
                   }
                   if (_numChars > 0)
                   {
                       return new string(_charBuffer, 0, _numChars);
                   }
                   return string.Empty;
               }
           }
       }
   }
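
As a rough illustration of how the class above might be called from a rich-client application, here is a minimal sketch. It assumes the HttpUtility listing is compiled into the same project and that the full System.Web assembly is not also referenced (otherwise the type name would be ambiguous). The UrlEncodingDemo class and the sample strings are hypothetical, and the expected values in the comments follow from the listing above (note the lowercase hex digits produced by IntToHex).

```csharp
using System;
using System.Web; // resolves to the HttpUtility class from the listing above

// Hypothetical demo class, not part of the decompiled source.
public static class UrlEncodingDemo
{
    public static void Main()
    {
        // '=' and '&' are not in IsSafe, so they are percent-encoded;
        // the space becomes '+', and U+00FC is encoded as its two UTF-8 bytes.
        string raw = "name=John Smith&city=Z\u00fcrich";
        string encoded = HttpUtility.UrlEncode(raw);
        Console.WriteLine(encoded); // name%3dJohn+Smith%26city%3dZ%c3%bcrich

        // Decoding reverses the process (UTF-8 is the default encoding).
        string decoded = HttpUtility.UrlDecode(encoded);
        Console.WriteLine(decoded == raw); // True

        // UrlPathEncode only touches the part before '?': spaces become %20
        // and the query string is passed through unchanged.
        Console.WriteLine(HttpUtility.UrlPathEncode("/my docs/report.txt?q=a b"));
        // /my%20docs/report.txt?q=a b
    }
}
```

UrlEncode is the one to reach for when building query-string values, while UrlPathEncode is the gentler option for whole URLs because it leaves reserved characters such as '/' and '?' alone.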

Note: The source for HttpUtility is also available through the .NET Framework source release, but I actually found it easier to get a clean copy by decompiling.

The code above was formatted using Leo Vildosola's Code Snippet plugin for Windows Live Writer, which I just started using with this post.

Saturday, November 29, 2008

Impact of the .NET Framework on Software Installations

The size of the .NET Framework redistributable exploded with versions 3.0 and 3.5. This creates some difficult choices for vendors of rich-client applications, as a lengthy or unwieldy installation experience can easily discourage non-technical users from using your product. The Paint.NET folks have recently put a huge amount of effort into streamlining their installation process for this very reason.

It's easy to find the total download sizes of the various .NET framework distributions. But the actual time to download and install each one can be difficult to estimate. To get a better idea of the actual bootstrap install times, I ran a number of test installations on a VPC with the following configuration:

VPC OS:               Windows XP Pro SP2 + virtual machine additions
VPC Host:             1.7 GHz P4 running Windows XP Pro SP3
Internet Connection:  4.0+ Mbps cable modem

Notes on the Test Configuration and Methodology

  1. The test VPC is likely slower than a typical Windows XP machine. Spot checks with a significantly faster VPC host seemed to reduce installation times by about 15 percent across the board. Considering the performance-dampening cruft that accumulates on the average consumer PC, the processing power available here probably isn't that atypical.
  2. My Internet connection is faster than a "typical" 1.5 Mbps broadband connection. But this seemed to be a non-factor, as the download speeds reported by the .NET installers never exceeded 600 kbps.
  3. Installation times varied by as much as 40-50 percent between identical runs. For example, the fresh 3.5 Client Profile install ranged from 11 to 16 minutes. It wasn't always clear what caused the performance variations, but the main culprit seemed to be dropped or slow connections between the installers and the download server. The test results below show the shortest time for each install, not the average time.
  4. I ran each installation documented below three to six times. Generally, I ran more fresh installs and fewer upgrades.

Here are the resulting installation times:

Previously Installed  New .NET Version Installed    Reboot      Bootstrap Download  Time Saved
.NET Version                                        Requested?  + Install Time      by Upgrade
--------------------  ----------------------------  ----------  ------------------  ----------
None                  .NET 2.0 SP1                  No          10 minutes          N/A
None                  .NET 3.0                      No          25 minutes          N/A
None                  .NET 3.5 SP1                  No          17 minutes          N/A
None                  .NET 3.5 SP1 Client Profile   No          11 minutes          N/A
.NET 2.0              .NET 3.0                      No          17 minutes          8 minutes
.NET 2.0              .NET 3.5 SP1                  No          21 minutes          -4 minutes
.NET 2.0              .NET 3.5 SP1 Client Profile   Yes         18 minutes          -7 minutes
.NET 3.0              .NET 3.5 SP1                  Yes         17 minutes          0 minutes
.NET 3.0              .NET 3.5 SP1 Client Profile   Yes         18 minutes          -7 minutes
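
The "Time Saved by Upgrade" figures appear to be the fresh-install time for the target version minus the upgrade time; that reading is inferred from the numbers rather than stated outright, but it matches every row. A small sketch of the arithmetic (the UpgradeTimes class and MinutesSaved helper are hypothetical names used only for illustration):

```csharp
using System;

// Hypothetical helper illustrating how the last column seems to be derived.
public static class UpgradeTimes
{
    // Assumed formula: fresh-install minutes for the target version
    // minus the minutes taken by the upgrade install.
    public static int MinutesSaved(int freshInstallMinutes, int upgradeMinutes)
    {
        return freshInstallMinutes - upgradeMinutes;
    }

    public static void Main()
    {
        // .NET 2.0 -> 3.0: fresh 3.0 takes 25, upgrade takes 17.
        Console.WriteLine(MinutesSaved(25, 17)); // 8

        // .NET 2.0 -> 3.5 SP1: fresh 3.5 SP1 takes 17, upgrade takes 21.
        Console.WriteLine(MinutesSaved(17, 21)); // -4

        // .NET 2.0 -> 3.5 SP1 Client Profile: fresh takes 11, upgrade takes 18.
        Console.WriteLine(MinutesSaved(11, 18)); // -7
    }
}
```

A negative result means the upgrade was slower than installing the same version on a clean machine.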

Notes on the Test Results

  1. In most cases, an existing .NET installation actually makes the installation significantly longer, with the exception being the 2.0 to 3.0 upgrade.
  2. 3.5 performance is considerably improved over 3.0, even for the full 3.5 distribution.
  3. A fresh install of the 3.5 Client Profile runs nearly as fast as a fresh 2.0 install.
  4. The 2.0 to 3.5 upgrade requests a reboot when using the Client Profile installer, but not when installing the full 3.5 distribution. Weird. And unfortunate.
  5. The 3.5 Client Profile installer is much nicer for non-technical users than any of the other installers.

Conclusions

If you're deciding whether to target .NET 3.0 or .NET 3.5, then 3.5 is a no-brainer--even if you need the full 3.5 distribution. If you can get away with the 3.5 client profile, then going with 3.5 is really a no-brainer.

The choice between 2.0 and 3.5 is more difficult, especially if most potential users are already on .NET 2.0. In this scenario, there would be no .NET install for users on 2.0 and a short (10 min) .NET install for users with no previous .NET installation. Upgrading to 3.5, on the other hand, would result in a much longer (18 min) install plus a reboot for users on .NET 2.0 (client profile install only). There are two mitigating factors that might lead you to consider going with 3.5 over 2.0 regardless of these drawbacks:

  1. As I mentioned above, the .NET 3.5 Client Profile installer is much nicer for end-users than any of the other installers, with minimal user-interaction required.
  2. Apparently Windows Update will soon push .NET 3.5 SP1 out to machines with .NET 2.0 already installed, so many users currently on 2.0 will not experience the long install plus reboot required for the 2.0 to 3.5 Client Profile upgrade. (I read this on one of the Microsoft blogs, but can't find the link at the moment.)

UPDATE: As one reader pointed out, the full .NET 3.5 SP1 framework is quietly installed any time you perform an upgrade, rather than the client profile. That explains why the upgrade installations took about the same time whether running the client profile or full framework installer. Here's a reference document that explains what happens with various operating systems and upgrades. From that document:

NOTE: The .NET Framework Client Profile is targeted for Windows XP computers with no .NET Framework components installed. If the .NET Framework Client Profile installation process detects any other operating system version or any version of .NET Framework installed, the Client Profile installer will install .NET Framework 3.5 Service Pack 1.

Tuesday, August 26, 2008

Converting a Partitioned Table to a Nonpartitioned Table In Sql Server 2005

Several months ago while working with Sql Server 2005 partitioned tables for the first time, I discovered an interesting bug/hidden feature that doesn't seem to be documented anywhere: Adding a clustered primary key constraint can quietly revert a partitioned table to a nonpartitioned one. At the time I found this behavior quite annoying, but it actually came in handy today when I needed to change the data type of a column used in the table's partition scheme from smalldatetime to datetime. Microsoft's knowledge base article on modifying partitioned tables indicates only that you may collapse multiple partitions into a single partition. It doesn't provide any options for departitioning tables--other than dropping and recreating them from scratch, of course.

To demonstrate departitioning we must first create a simple partitioned table. To do so execute this Sql:

   CREATE PARTITION FUNCTION MyPartitionRange (INT) 
   AS RANGE LEFT FOR VALUES (1,2) 

   CREATE PARTITION SCHEME MyPartitionScheme AS 
   PARTITION MyPartitionRange 
   ALL TO ([PRIMARY]) 

   CREATE TABLE MyPartitionedTable 
          ( 
          i INT NOT NULL, 
          s CHAR(8000) , 
          PartCol INT 
          ) 
   ON
    MyPartitionScheme (PartCol)     

(If you don't understand what's happening in each of the steps above, read this tutorial for more complete instructions--see "Creating the Partitioned Table".) Execute this Sql to see a list of partitions for the new table (you should see three):

   SELECT *
   FROM sys.partitions
   WHERE OBJECT_ID = OBJECT_ID('MyPartitionedTable')

Now, let's say this table has been running in production for several months, has lots of data, and you realize you need to expand PartCol to a bigint. You can't change the PartCol data type with an alter table statement:

   ALTER TABLE MyPartitionedTable ALTER COLUMN PartCol bigint

The Sql above fails with the somewhat obscure error "The object 'MyPartitionedTable' is dependent on column 'PartCol'." This won't change even if you collapse multiple partitions into a single partition using ALTER PARTITION FUNCTION with the SPLIT RANGE and MERGE RANGE options (the approach recommended by Microsoft), because the table is still partitioned on PartCol. Instead, you can execute this Sql to departition the table completely:

   ALTER TABLE MyPartitionedTable
   ADD CONSTRAINT [PK_MyPartitionedTable] PRIMARY KEY CLUSTERED([i]) ON [PRIMARY]

You should now see just one partition for this table in sys.partitions:

   SELECT *
   FROM sys.partitions
   WHERE OBJECT_ID = OBJECT_ID('MyPartitionedTable')

Other important points to note:

  1. This only works if the primary key column is not the table's partitioning column (the column passed to the partition scheme, PartCol in this example).
  2. If you forget to include "ON PRIMARY" when creating the primary key you'll run into another obscure error: "Column 'PartCol' is partitioning column of the index 'PK_MyPartitionedTable'. Partition columns for a unique index must be a subset of the index key." Sql Server is trying to tell you that you can't create a clustered index on "i" because it's not used in your partition function. In other words, you could create a clustered index including both "i" and "PartCol" and still maintain partitioning.
  3. A non-clustered primary key won't departition the table, even if you specify "ON PRIMARY".
  4. Adding any clustered index should accomplish the same thing--but I haven't verified this.

Finally, you are free to change the PartCol data type. You can also repartition the table afterward, if necessary, by following the steps in this article.

Saturday, May 31, 2008

Video Scene Detection with DirectShow.NET

For some time I've been working on a video-related personal project. I'm using the fantastic DirectShow .NET library, which provides a nice C# interface to Microsoft's DirectShow C++ API. At one point some folks on the DS .NET forums asked about the scene detection algorithm I referenced in one of my forum posts. I promised to follow up with some sample code and explanations and--finally--here they are.

I've created a sample solution to demonstrate my scene detection algorithm. It's based on the DxScan sample available with other DS .NET samples on the DS .NET download page. My algorithm is not yet production code but has proven very reliable in my own testing. It is 100% accurate against my test video library, which is 600 minutes of actual sports video with 1,800 scene changes (including both night and daytime events) plus several short test videos created explicitly to strain the algorithm.

At a high level, scene detection involves the following steps:

  1. Randomly select 2,000 of the RGB values composing a single video frame. The same 2,000 positions are then sampled in every frame; these are the values on which we'll perform a longitudinal (or cross-frame) analysis to detect scene changes for the entire duration of the video.
  2. Analyze the current frame:
    1. Calculate the average RGB value for the current frame. If the RGB values are unusually high or low, the scene was shot in bright or dim light and we'll need to raise or lower our scene detection thresholds accordingly.
    2. Perform an XOR diff between the RGB values in the previous and current frames. The XOR diff amplifies minor differences between frames (vs a simple integer difference) which improves detection of scene changes involving similar scenes as well as detection in low-light conditions where we tend to be dealing with lower RGB values.
    3. Calculate the average RGB difference between the current and previous frames. In other words, add up the XOR diff values from step 2.2 and divide by the number of sampled values.
    4. Calculate the change in average RGB difference between the current and previous frames, that is, the current frame's average difference minus the previous frame's. This is a bit tricky to understand, but it's critical to achieving a high level of accuracy when differentiating between new scenes and random noise (such as high-motion close-ups or quick pans/zooms). If the previous frame's change in average RGB difference is above a defined positive threshold (normalized for the light conditions detected in step 2.1) and the current frame's change in average RGB difference is below a defined negative threshold, then the previous frame is flagged as a scene change. In simple terms, we're taking advantage of the fact that scene changes nearly always result in a two-frame spike/crash in frame-to-frame differences, while pans, zooms, and high-motion close-ups result in a gradual ramp-up/ramp-down in frame-to-frame differences.
    5. Advance to the next frame and repeat step 2.
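
Here is a rough C# sketch of the per-frame loop described above. It is only an illustration and not the code from the sample solution: the class name SceneChangeSketch, its constructor parameters, and the threshold values are hypothetical, each frame is assumed to arrive as a flat array of RGB bytes, and the light normalization (scaling both thresholds by the average sampled value divided by 128) is a placeholder assumption, since the steps above don't spell out the actual rule.

```csharp
using System;

// Hypothetical sketch of steps 1 and 2; names and thresholds are assumptions.
public sealed class SceneChangeSketch
{
    private readonly int[] offsets;          // step 1: byte offsets sampled in every frame
    private readonly double spikeThreshold;  // positive threshold (assumed value)
    private readonly double crashThreshold;  // negative threshold (assumed value)
    private byte[] previous;                 // sampled values from the previous frame
    private double previousAvgDiff;          // step 2.3 result for the previous frame
    private double previousDelta;            // step 2.4 result for the previous frame
    private int frameIndex = -1;

    public SceneChangeSketch(int frameBytes, int sampleCount,
                             double spikeThreshold, double crashThreshold, int seed)
    {
        var random = new Random(seed);
        offsets = new int[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            offsets[i] = random.Next(frameBytes);
        }
        this.spikeThreshold = spikeThreshold;
        this.crashThreshold = crashThreshold;
    }

    // Feeds one frame of raw RGB bytes. Returns the index of the frame flagged
    // as a scene change (always the previous frame), or -1 if none was flagged.
    public int ProcessFrame(byte[] frame)
    {
        frameIndex++;
        var current = new byte[offsets.Length];
        long valueSum = 0;
        long xorSum = 0;
        for (int i = 0; i < offsets.Length; i++)
        {
            current[i] = frame[offsets[i]];
            valueSum += current[i];
            if (previous != null)
            {
                xorSum += current[i] ^ previous[i];  // step 2.2: XOR diff
            }
        }

        // Step 2.1: average sampled value, used for a placeholder light
        // normalization (an assumption, not the rule used in the sample).
        double avgValue = (double)valueSum / offsets.Length;
        double lightScale = Math.Min(2.0, Math.Max(0.25, avgValue / 128.0));

        double avgDiff = (double)xorSum / offsets.Length;  // step 2.3
        double delta = avgDiff - previousAvgDiff;          // step 2.4

        // A spike on the previous frame followed by a crash on this one.
        // The first frames are skipped because their deltas are not meaningful.
        bool sceneChange = frameIndex >= 3
            && previousDelta > spikeThreshold * lightScale
            && delta < crashThreshold * lightScale;

        previous = current;
        previousAvgDiff = avgDiff;
        previousDelta = delta;
        return sceneChange ? frameIndex - 1 : -1;
    }
}
```

Note that a scene change can only be reported one frame late, because the crash half of the spike/crash pattern isn't visible until the following frame has been analyzed.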

I'll try to expand and clarify the above steps when I have time, but for now you'll have to read the code if you need to understand the algorithm in more detail. The only limitations in the current implementation (that I'm aware of) are the following:

  1. Dropped frames are interpreted as scene changes. This issue can be minimized in most applications by choosing a minimum scene duration and discarding new-scene events fired by the SceneDetector inside the minimum-duration window.
  2. Scene transition effects (fades, dissolves, etc.) are not supported and scene changes involving such effects are not detected.
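
The minimum-duration workaround from item 1 might look something like the sketch below. The names SceneEventFilter and DiscardShortScenes are hypothetical (they are not part of the sample's SceneDetector), and the scene changes are assumed to arrive as ascending frame numbers.

```csharp
using System.Collections.Generic;

// Hypothetical helper; not part of the sample solution.
public static class SceneEventFilter
{
    // Keeps a scene-change frame only if it falls at least minSceneFrames
    // after the last scene change that was kept.
    public static List<int> DiscardShortScenes(IEnumerable<int> sceneFrames, int minSceneFrames)
    {
        var kept = new List<int>();
        foreach (int frame in sceneFrames)
        {
            if (kept.Count == 0 || frame - kept[kept.Count - 1] >= minSceneFrames)
            {
                kept.Add(frame);
            }
        }
        return kept;
    }
}
```

This only suppresses events that land inside the window after a kept scene change, so it reduces the dropped-frame problem rather than eliminating it.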

If you encounter any other issues with the algorithm, I'd love the opportunity to see and analyze the video that broke it!

Thursday, February 07, 2008

Unit testing too difficult? Change your design

Lessons and processes from building construction, physical goods manufacturing, and other engineering disciplines are often misapplied to software creation. Nevertheless, these disciplines occasionally provide very useful analogies. One of these is the idea that a given design must accommodate more than just functional and aesthetic needs.

For example, designers of physical goods must consider how much it will cost to actually build what they're designing. The costs to procure materials, tool up a factory, and train an army of workers are a major portion of the cost of bringing a physical good to market. Want to design a car you can sell for $20k? Skip the Italian leather seats. Forget about the brand-new, high-compression engine that would double factory tooling costs. Drop the independent rear suspension.

One good thing about design-related manufacturing costs is that they are well-understood and naturally visible. They may be miscalculated, but there's little chance they'll be forgotten or ignored. This is not true in the software world. Instead, the cost implications of many design choices are invisible even to the designer—let alone the rest of the organization. One of these is the recurring cost to validate functionality as the software changes. When validation is 100% manual (meaning a person—whether a QA tester or developer—must explicitly execute and review the results of each test case) it becomes extremely expensive (in terms of time as well as money). (Not to mention that top-down, manual validation is simply too inefficient to exercise more than a small fraction of possible code paths and thus will allow many code defects to escape into the wild.)

We need ways to expose design-related validation costs early in the development cycle so they can be properly accounted for when planning and choosing features—and so the design can be altered when necessary to reduce those costs. We also need ways to minimize validation costs so our software can be tested thoroughly and still be profitably sold and supported.

Rigorous automated unit testing helps us both expose design-related validation costs and minimize those costs:

  • It exposes validation costs because we're forced to invest labor in creating automated tests at the time a feature is added or created, incorporating more of the long-term costs of the feature into the initial implementation schedule.
  • It minimizes validation costs because the labor invested in test creation pays off every time the software is modified, for as long as the software lives, which dramatically reduces total validation costs.

Automated unit testing forces validation costs to become a first-class design consideration: designs that are not unit-testable are failed designs and must be replaced by designs that are unit-testable and still meet functional and performance goals.

Rigorous unit testing should change our mindset toward one of designing for testability. With this mindset we should:

  • Focus the same level of energy and creativity on the design of our unit tests as on the components being tested.
  • Take the same care in organizing and maintaining our test code and projects as we take with the rest of our codebase.
  • Believe that a software feature is inseparable from its unit tests.
  • Alter our designs as necessary to support unit testing in both personal and automated build environments.
 