Friday, May 28, 2010

SharePoint SPList.LastItemModifiedDate And SPWeb.LastItemModifiedDate Are UTC Time

In the SharePoint API, an SPList object has a LastItemModifiedDate property that indicates when the list was last modified. The MSDN description for this property is: Gets the date and time that an item, field, or property of the list was last modified.

We used this property to validate an in-memory cache in one project, but the cache was not refreshed properly when a list item changed. We finally figured out that this property returns a UTC time, not a local time. To correct the issue, simply use the ToLocalTime method:
DateTime listLastUpdate = list.LastItemModifiedDate.ToLocalTime();
Similarly, SPWeb.LastItemModifiedDate is also a UTC time, and you need to convert it to local time manually when you use it in code.
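
For illustration, here is a minimal sketch of the cache-validation pattern that bit us, with the UTC value converted before comparison. The ListCacheEntry type and the cache dictionary are hypothetical names for this example, not part of the SharePoint API:

using System;
using System.Collections.Generic;
using Microsoft.SharePoint;

class ListCacheEntry
{
    public DateTime LastUpdate;       // local time of the last refresh
    public List<string> ItemTitles;   // whatever data we cached from the list
}

static class ListCache
{
    static readonly Dictionary<Guid, ListCacheEntry> cache = new Dictionary<Guid, ListCacheEntry>();

    public static ListCacheEntry Get(SPList list)
    {
        // Convert the UTC value to local time before comparing it with our
        // locally generated timestamps; otherwise the entry looks "fresh"
        // for several hours after the list has changed.
        DateTime lastModified = list.LastItemModifiedDate.ToLocalTime();

        ListCacheEntry entry;
        if (!cache.TryGetValue(list.ID, out entry) || entry.LastUpdate < lastModified)
        {
            entry = new ListCacheEntry { LastUpdate = lastModified, ItemTitles = new List<string>() };
            foreach (SPListItem item in list.Items)
                entry.ItemTitles.Add(item.Title);
            cache[list.ID] = entry;
        }
        return entry;
    }
}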

In fact, all DateTime values stored in SQL Server, such as a list item's "Modified" time, are in UTC. SharePoint normally converts them back to local time when you read item fields:
SPListItem item = list.Items[0];
DateTime modifiedTime = Convert.ToDateTime(item["Modified"]);
I don't see any reason why the LastItemModifiedDate property should use UTC. My guess is that the value is not converted the way regular field values are; instead the raw database value is wrapped directly, something like this pseudocode:

DateTime listModifiedDate = new DateTime(/* raw value from database, no time zone conversion */);
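
The underlying pitfall is easy to reproduce outside SharePoint. A DateTime built from a raw stored value has Kind set to Unspecified, so nothing tells the caller it actually represents UTC; the caller has to know and convert explicitly (a minimal sketch, with a made-up timestamp):

using System;

class UtcPitfall
{
    static void Main()
    {
        // Pretend this value was read straight from the database (stored as UTC).
        DateTime fromDb = new DateTime(2010, 5, 28, 14, 30, 0);

        Console.WriteLine(fromDb.Kind);   // Unspecified - no time zone info attached

        // Tag it as UTC, then convert; e.g. 14:30 UTC becomes 10:30 in EDT (UTC-4).
        DateTime utc = DateTime.SpecifyKind(fromDb, DateTimeKind.Utc);
        Console.WriteLine(utc.ToLocalTime());
    }
}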
Is this a bug that exists only in SharePoint 2007? I checked the latest SharePoint 2010 RTM release (build 14.0.4762), and to my surprise these properties still return UTC time. Microsoft should at least document or explain such inconsistencies in the API.

Saturday, May 15, 2010

SQL Server Vs MongoDB - Performance Perspective

A relational database such as SQL Server is expected to be slower than NoSQL solutions. That's understandable: a relational database has a different focus, and ACID has its cost. But I didn't expect the performance gap to be so big until I ran some real tests.

The test machine is the same one I used for the MongoDB tests last time:
  • OS: Windows Server 2008 64-bit SP2
  • CPU: Intel i5 750
  • Memory: 6G
  • DB: SQL Server 2008 Enterprise edition SP1
As in the MongoDB tests last time, I created a simple test table with only three columns:
  id    -- uniqueidentifier (clustered index)
  key   -- varchar(40) (unique non-clustered index)
  value -- varchar(40) (no index)
The MongoDB test code was modified to test SQL Server. I ran the console application and was eager to see the results, but the first test case, inserting 1 million records into a local database, never seemed to complete.
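
For reference, the insert loop looked roughly like this; a simplified sketch of my test, assuming the testTable schema above and a local connection string:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;

class InsertTest
{
    static void Main()
    {
        const int Max = 1000000;
        using (SqlConnection conn = new SqlConnection("Server=.;Database=TestDB;Integrated Security=true"))
        {
            conn.Open();
            SqlCommand cmd = new SqlCommand(
                "INSERT INTO testTable (id, [key], [value]) VALUES (@id, @key, @value)", conn);
            cmd.Parameters.Add("@id", SqlDbType.UniqueIdentifier);
            cmd.Parameters.Add("@key", SqlDbType.VarChar, 40);
            cmd.Parameters.Add("@value", SqlDbType.VarChar, 40);

            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < Max; i++)
            {
                Guid id = Guid.NewGuid();
                cmd.Parameters["@id"].Value = id;
                cmd.Parameters["@key"].Value = id.ToString();
                cmd.Parameters["@value"].Value = id.ToString();
                cmd.ExecuteNonQuery();   // one synchronous round trip per row
            }
            watch.Stop();
            Console.WriteLine("{0} inserts in {1} ms", Max, watch.ElapsedMilliseconds);
        }
    }
}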

The insert operations were just too slow. Was something wrong with my machine? I installed all important Windows/SQL Server updates, disabled almost all unnecessary processes, defragmented the disk, and redid the test. The result was still very disappointing: fewer than 400 inserts per second.

Then I tried a few options:

1. Changed id to a non-clustered index: no big difference
2. Changed the id type from uniqueidentifier to varchar(40): no big difference
3. Added an extra identity integer column as the primary key: no big difference
4. Changed key/value from varchar(40) to char(40): no big difference
5. Changed the key index's fill factor to 10, 30, 60, 80 and 100: only slight differences among them
6. Did the insertion in a T-SQL script instead of from the client (with key and value bracketed, since key is a reserved word, and explicit CONVERTs for the uniqueidentifier): no big difference

    DECLARE @id uniqueidentifier
    DECLARE @Count int, @Max int
    SET @Count = 0
    SET @Max = 1000000
    WHILE @Count < @Max
    BEGIN
        SET @id = NEWID()
        INSERT INTO testTable (id, [key], [value])
        VALUES (@id, CONVERT(varchar(40), @id), CONVERT(varchar(40), @id))
        SET @Count = @Count + 1
    END

7. Used a stored procedure: improved a little
8. Changed key's unique non-clustered index to a plain non-clustered index: improved a little
9. Changed the database recovery model from full to simple: improved a little

After all kinds of tuning, the maximum insertion speed on my local machine was around 900 per second. That's not even close to MongoDB, where I got 20000+ insertions per second without any extra tuning work.

I googled and found that many people are having the same problem. The article SQL Server Slow Performance on Insert compiles a few resources on the topic of slow insertion in SQL Server.
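
One remedy such resources commonly suggest is batching rows instead of inserting them one at a time. I did not use it here because I wanted row-by-row inserts comparable to the MongoDB test, but for completeness, a sketch with SqlBulkCopy (same hypothetical connection string as above) looks like this:

using System;
using System.Data;
using System.Data.SqlClient;

class BulkInsert
{
    static void Main()
    {
        // Build the rows in memory first.
        DataTable table = new DataTable();
        table.Columns.Add("id", typeof(Guid));
        table.Columns.Add("key", typeof(string));
        table.Columns.Add("value", typeof(string));
        for (int i = 0; i < 1000000; i++)
        {
            Guid id = Guid.NewGuid();
            table.Rows.Add(id, id.ToString(), id.ToString());
        }

        // Stream the rows to the server in large batches instead of
        // paying one round trip per row.
        using (SqlBulkCopy bulk = new SqlBulkCopy("Server=.;Database=TestDB;Integrated Security=true"))
        {
            bulk.DestinationTableName = "testTable";
            bulk.BatchSize = 10000;
            bulk.WriteToServer(table);
        }
    }
}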

Anyway, I was patient enough to wait for the 1 million records to be inserted. Then I ran the rest of the tests, with the following results (local test):
Insert: 900/second
Update: 1500/second
Search on clustered-indexed column: 9000/second
Search on non-clustered indexed column: 2000/second
Search on non-indexed column: 27/second
Setting aside the non-indexed column search (a table scan), only the search on the clustered-indexed column in SQL Server is comparable to MongoDB's results.

Sunday, May 02, 2010

Test MongoDB In Windows Environment

The NoSQL movement is becoming more and more popular, but most NoSQL product environments today are Unix-based. MongoDB is one of the few solutions that provide a Windows installation and a .NET driver. This exercise evaluates MongoDB's performance on the Windows platform.

The test machines' configuration:
  • OS: Windows Server 2008 64-bit SP2
  • CPU: Intel i5 750
  • Memory: 6G
Both local and remote tests are examined in this exercise. Local tests are conducted inside one physical box; in remote tests, the MongoDB server and the client process run on separate machines, connected by a 100 Mbit/sec network.

A simple document with a key (unique-indexed) / value (non-indexed) pair is used for testing. All fields (including the internal _id field) store a GUID value:
{
_id: "12345678-1234-1234-1234-123456789012", // Internal field
key: "12345678-1234-1234-1234-123456789012", // Unique-indexed
value: "12345678-1234-1234-1234-123456789012" // Non-indexed
}
Manually setting the "_id" field rather than using the default Oid format is generally not recommended. The reason for doing so in these tests is to perform the update and delete operations efficiently, without an extra search.


Local test results:

(All numbers are in document actions per second; 1M = 1000000)

Document number     1M      2M      4M      8M      10M
Insert            32210   29540   27160   22100   20530
Update            18560   18160   18010   17900   17540
Delete            12660   12510   11980   11610   11550
Search on _id     13340   13330   13350   12990   12130
Search on key      9190    9280    9300    9110    9010
Search on value     7.2     3.6     1.8     0.9     0.7


Remote test results:

Document number     1M      2M      4M      8M      10M
Insert            31450   30700   22420   18810   17600
Update            18180   18160   17400   18100   17830
Delete            12550   12270   11730   11680   11530
Search on _id      2410    2440    2410    2420    2420
Search on key      1470    1470    1450    1450    1440
Search on value     7.2     3.6     1.8     0.9     0.7


Findings:
  1. The fastest operation is insert.
  2. Search on _id is faster than search on the other indexed field.
  3. Search on an indexed field appears to take constant time, not proportional to the size of the data set.
  4. Search on a non-indexed field shows very poor performance, and the time spent is proportional to the size of the data set. This is similar to a table-scan scenario in a relational database.
  5. There's no big gap in insert/update/delete performance between the local and remote tests.
  6. Remote search is significantly slower than local search.
By digging into the source code of the MongoDB-CSharp driver, I found that the insert, update and delete functions on the client side are implemented as one-way IO writes without any acknowledgement, i.e. the driver just keeps sending data to the TCP NetworkStream. Streamed data can be buffered efficiently on both the client and the server side, so from the client's point of view the process is asynchronous. That explains why there's no big difference between the remote and local tests in the batch insert/update/delete scenarios. Read and search operations (the Find-related functions), on the other hand, require two-way communication: the client sends a request to the server and then waits for the response to complete, so the server's processing time plus the round-trip network delay are counted against every request. That's why remote search is significantly slower than the tests conducted in the local box.
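
The difference between the two IO patterns can be illustrated with a simplified sketch (this is not the driver's actual code):

using System.Net.Sockets;

class WirePatterns
{
    // Fire-and-forget: write the message and return immediately. TCP
    // buffering coalesces the writes, so a tight loop of these approaches
    // local throughput even over a network.
    static void SendInsert(NetworkStream stream, byte[] message)
    {
        stream.Write(message, 0, message.Length);
    }

    // Request/response: every call pays the server processing time plus a
    // full network round trip before the next query can be issued.
    static byte[] SendQuery(NetworkStream stream, byte[] request, int replySize)
    {
        stream.Write(request, 0, request.Length);
        byte[] reply = new byte[replySize];
        int read = 0;
        while (read < replySize)
            read += stream.Read(reply, read, replySize - read);
        return reply;
    }
}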

I thought the relatively low speed of the indexed searches in my test results might be caused by the single-threaded test application, so I modified the code and reran the tests with 2, 4, 6, 8 and 10 threads. Surprisingly, the performance did not improve at all.

The test code is listed below. It is simple and may not be accurate enough, but it gave me an idea of how MongoDB performs in general in a Windows environment.

Test code:
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Diagnostics;
using System.Threading;
using MongoDB.Driver;

public class Program
{
    struct TimeUnit
    {
        public int NumberOfRecords;
        public long MilliSeconds;

        public TimeUnit(int num, long ms)
        {
            NumberOfRecords = num;
            MilliSeconds = ms;
        }
    }

    const int MaxDocuments = 1000000;
    const int NonIndexSearchCount = 1000;
    const int ThreadCount = 1;
    const string DBConnectionString = "Server=127.0.0.1";
    const string DBName = "TestMongoDB";
    const string DBCollectionName = "TestCollection";

    static string[] testContent = new string[MaxDocuments];
    static object lockObject = new object();
    static Dictionary<string, List<TimeUnit>> resultTimes = new Dictionary<string, List<TimeUnit>>();

    static void Main(string[] args)
    {
        // Initialize data
        for (int i = 0; i < MaxDocuments; i++)
            testContent[i] = Guid.NewGuid().ToString();

        // Start threads
        List<Thread> threads = new List<Thread>();
        for (int i = 0; i < ThreadCount; i++)
        {
            int max = MaxDocuments / ThreadCount;
            int insertStart = i * max;
            int searchStart = insertStart + max / 2;
            int searchEnd = searchStart + max / 10;
            Thread thread = new Thread(() => TestMongoDB(insertStart, searchStart, searchEnd, max));
            thread.Start();
            threads.Add(thread);
            //ThreadPool.QueueUserWorkItem(o => TestMongoDB(insertStart, searchStart, searchEnd, max));
        }
        Console.WriteLine();

        // Wait for threads to complete
        foreach (Thread thread in threads)
            thread.Join();

        // Display results
        Console.WriteLine("Results:");
        foreach (var key in resultTimes.Keys)
        {
            List<TimeUnit> times = resultTimes[key];
            Console.Write(key + ":");
            long totalTime = 0, totalNumber = 0;
            foreach (TimeUnit st in times)
            {
                totalTime += st.MilliSeconds;
                totalNumber += st.NumberOfRecords;
                Console.Write(st.MilliSeconds + " ");
            }
            Console.WriteLine(string.Format("{0} speed is : {1}/sec", key, totalNumber * 1000.0 / totalTime));
        }
        Console.WriteLine("Done");
        Console.Read();
    }

    /// <summary>
    /// MongoDB test. Document format:
    /// {
    ///     _id,  // internal field
    ///     key,  // indexed field
    ///     value // non-indexed field
    /// }
    /// </summary>
    static void TestMongoDB(int insertStart, int searchStart, int searchEnd, int max)
    {
        // Init MongoDB
        Mongo mongo = new Mongo(DBConnectionString);
        IMongoDatabase db;
        IMongoCollection documents;
        Document doc = new Document();
        Stopwatch watch = new Stopwatch();

        mongo.Connect();
        db = mongo[DBName];
        documents = db.GetCollection(DBCollectionName);

        // Setup index on key field
        lock (lockObject)
        {
            if (!documents.MetaData.Indexes.ContainsKey("key"))
            {
                Document indexDoc = new Document { { "key", 1 } };
                documents.MetaData.CreateIndex(indexDoc, true);
            }
        }

        // Insert data
        watch.Start();
        for (int i = insertStart; i < insertStart + max; i++)
        {
            Document newdoc = new Document();
            newdoc["_id"] = newdoc["key"] = newdoc["value"] = testContent[i];
            documents.Insert(newdoc);
        }
        watch.Stop();
        AddTimeUnit("insert", new TimeUnit(max, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB insert {0} records: {1} ms",
            max, watch.ElapsedMilliseconds.ToString()));
        watch.Reset();

        // Search on _id field
        watch.Start();
        for (int i = searchStart; i < searchEnd; i++)
        {
            doc["_id"] = testContent[i];
            var doc1 = documents.FindOne(doc);
            string value = doc1["value"].ToString();
        }
        watch.Stop();
        AddTimeUnit("search-id", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB {0} search on id field over {1} records: {2} ms",
            searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
        watch.Reset();
        doc = new Document();

        // Search on indexed "key" field
        watch.Start();
        for (int i = searchStart; i < searchEnd; i++)
        {
            doc["key"] = testContent[i];
            var doc1 = documents.FindOne(doc);
            string value = doc1["value"].ToString();
        }
        watch.Stop();
        AddTimeUnit("search-key", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB {0} search on indexed field over {1} records: {2} ms",
            searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
        watch.Reset();
        doc = new Document();

        // Search on non-indexed "value" field
        watch.Start();
        for (int i = searchStart; i < searchStart + NonIndexSearchCount; i++)
        {
            doc["value"] = testContent[i];
            var doc1 = documents.FindOne(doc);
            string value = doc1["value"].ToString();
        }
        watch.Stop();
        AddTimeUnit("search-value", new TimeUnit(NonIndexSearchCount, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB {0} search on non-indexed field over {1} records: {2} ms",
            NonIndexSearchCount, documents.Count(), watch.ElapsedMilliseconds));
        watch.Reset();
        doc = new Document();

        // Update test
        watch.Start();
        for (int i = searchStart; i < searchEnd; i++)
        {
            doc["_id"] = testContent[i];
            //var doc1 = documents.FindOne(doc);
            doc["key"] = testContent[i];
            doc["value"] = i.ToString();
            documents.Update(doc);
        }
        watch.Stop();
        AddTimeUnit("update", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB {0} update over {1} records: {2} ms",
            searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
        watch.Reset();
        doc = new Document();

        // Delete test
        watch.Start();
        for (int i = searchStart; i < searchEnd; i++)
        {
            doc["_id"] = testContent[i];
            //var doc1 = documents.FindOne(doc);
            documents.Delete(doc);
        }
        watch.Stop();
        AddTimeUnit("delete", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
        Console.WriteLine(string.Format("MongoDB {0} delete over {1} records: {2} ms",
            searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
    }

    private static void AddTimeUnit(string key, TimeUnit st)
    {
        lock (lockObject)
        {
            if (resultTimes.ContainsKey(key))
            {
                List<TimeUnit> times = resultTimes[key];
                times.Add(st);
            }
            else
            {
                List<TimeUnit> times = new List<TimeUnit>();
                times.Add(st);
                resultTimes.Add(key, times);
            }
        }
    }
}