Friday, 31 July 2020

I am no Charles TDickensionary<>

- or a few TDictionary<> tricks.


One of my favourite classes in Delphi is TDictionary<> - might be because it resembles a database table which can have composite keys, and I have always been a ClientDataset/VirtualTable/FDMemTable fan-boy :)

I do not get to use the memTables that much in recent years, and the RTL generic collections have also got better - while not on par with third party collections like Spring4D - which I also do not get to use much.

I recently had a task where I had to transform some code that collected data from an external source. The source delivered the data in subsets per field/property, so multiple iterations had to be done before each row/item was complete. TDictionary<> to the rescue.


There are probably better ways of doing this - but this was the first thing that came to my mind, and it worked out well on the fairly complex set of data and the constraints in the form they came.

I have come up with an unrelated but similar, very small scenario - where the key to the data is Date and CharityId, and the values are Count, EmpId and Sum.

That should illustrate a case where Charities have collectors, identified by their employee id, who collect money daily. So the id of the collector for a given Date+Charity is separated from the donations by a "dimension" value: 0 carries the employee id, 1 carries a donation amount.

Stupid sample scenario, but that is what fast ideas give you - when real life is too complex to use as an example :D - here is the input data to transform:

IntDate, CharityId, dimension, value
0, 0, 0, 1
1, 0, 0, 2
0, 1, 0, 3
0, 0, 1, 23
1, 0, 1, 1.99
0, 0, 1, 12
0, 0, 1, 5
1, 0, 1, 10.59
0, 1, 1, 20.59

..and the output we want is as below, sorted by date and charity:

IntDate, CharityId, Count, EmpId, Sum
0, 0, 3, 1, 40
0, 1, 1, 3, 20.59
1, 0, 2, 2, 12.58

So types are defined as:

TKeyRecord = record
  IntDate: Integer;
  CharityId: Integer;
  class function New(intDate, charityId: Integer): TKeyRecord; static;
end;

TValueRecord = record
  Count: Integer;
  EmpId: Integer;
  Sum: Double;
end;

TDictionaryData = class(TDictionary<TKeyRecord,TValueRecord>)
  procedure AddData(const intDate, charityId, dim: Integer; const value: Double);
end;
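
The body of TKeyRecord.New is not shown in this post, so here is a minimal sketch of what it could look like. The implementation is an assumption (it simply copies the two parameters into the result), and the small console program around it is only there to make the sketch compile on its own:

```pascal
program KeyRecordSketch;

{$APPTYPE CONSOLE}

type
  TKeyRecord = record
    IntDate: Integer;
    CharityId: Integer;
    class function New(intDate, charityId: Integer): TKeyRecord; static;
  end;

// Assumed implementation: copy both parameters into a fresh key record.
class function TKeyRecord.New(intDate, charityId: Integer): TKeyRecord;
begin
  Result.IntDate := intDate;
  Result.CharityId := charityId;
end;

var
  Key: TKeyRecord;
begin
  // Build the key for IntDate = 1, CharityId = 0 and print both parts.
  Key := TKeyRecord.New(1, 0);
  Writeln(Key.IntDate, ', ', Key.CharityId);
end.
```

A static factory like this keeps the call sites short, since a key can be built inline without a local variable.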

Create and populate


The AddData procedure tries to find the value record by its key record. If that fails it adds a new value for the new key, otherwise it updates the fields in the value record given by the dimension in that iteration - the thing I worked on in real life had +20 dimensions and some were nested - terrible structure :)

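procedure TDictionaryData.AddData(const intDate, charityId, dim: Integer;
  const value: Double);
var
  KRec: TKeyRecord;
  VRec: TValueRecord;
begin
  KRec := TKeyRecord.New(intDate, charityId);
  // On a miss VRec comes back zeroed, so the result can be ignored here.
  Self.TryGetValue(KRec, VRec);
  // dim 0 carries the employee id (Floor needs System.Math in uses).
  if dim = 0 then
    VRec.EmpId := Floor(value);
  // dim 1 carries one donation: count it and add it to the sum.
  if dim = 1 then
  begin
    Inc(VRec.Count);
    VRec.Sum := VRec.Sum + value;
  end;
  Self.AddOrSetValue(KRec, VRec);
end;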

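Normally one would check the Boolean result of TryGetValue - but here we carry on regardless of what it is: on a miss TryGetValue hands back a zeroed (Default) value record, which is exactly the starting point for a new key. So in this sample the input data is then added as: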

Data := TDictionaryData.Create();
...
Data.AddData(0, 0, 0, 1);
Data.AddData(1, 0, 0, 2);
Data.AddData(0, 1, 0, 3);
Data.AddData(0, 0, 1, 23);
Data.AddData(1, 0, 1, 1.99);
Data.AddData(0, 0, 1, 12);
Data.AddData(0, 0, 1, 5);
Data.AddData(1, 0, 1, 10.59);
Data.AddData(0, 1, 1, 20.59);
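
AddData leans on what TryGetValue does when the key is missing. As far as I know the RTL then sets the value to Default(TValue), and the small sketch below is one way to check that on your own Delphi version. The Integer key and the name Dict are only chosen to keep the sample short - they are not part of the sample project:

```pascal
program TryGetValueSketch;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  TValueRecord = record
    Count: Integer;
    EmpId: Integer;
    Sum: Double;
  end;

var
  Dict: TDictionary<Integer, TValueRecord>;
  VRec: TValueRecord;
  Found: Boolean;
begin
  Dict := TDictionary<Integer, TValueRecord>.Create;
  try
    // Fill the record with junk first, so a reset is visible afterwards.
    VRec.Count := 99;
    VRec.EmpId := 99;
    VRec.Sum := 99;

    // The dictionary is empty, so key 42 cannot be found.
    Found := Dict.TryGetValue(42, VRec);

    // Print the lookup result and the fields after the failed lookup.
    Writeln(Found, ' ', VRec.Count, ' ', VRec.EmpId, ' ', VRec.Sum:0:2);
  finally
    Dict.Free;
  end;
end.
```

If the fields were not reset on a miss, AddData would have to clear VRec itself before accumulating.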

Add or accumulate


TDictionary<> does have TryGetValue and AddOrSetValue as shown above, but I often miss an AddOrAccumulate function that either adds new data or accumulates onto the value already there - and since our "Value" is a record and not a simple type, we would have to write some code anyway.

For this example I just created the AddData method on the class, but otherwise a class helper like the one shown here could be helpful for simple values:

TDictionaryHelper = class helper for TDictionary<string, Double>
  procedure AddOrAccumulate(const Key: string; const Value: Double);
end;
...
procedure TDictionaryHelper.AddOrAccumulate(const Key: string; const Value: Double);
var
  oldValue: Double;
begin
  if Self.TryGetValue(Key, oldValue) then
    Self.AddOrSetValue(Key, oldValue + Value)
  else
    Self.AddOrSetValue(Key, Value);
end;
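
To see the helper in use, here is a small self-contained sketch. The dictionary name Totals and the string keys are made up for the illustration, the helper itself is the one from above:

```pascal
program AddOrAccumulateSketch;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  TDictionaryHelper = class helper for TDictionary<string, Double>
    procedure AddOrAccumulate(const Key: string; const Value: Double);
  end;

procedure TDictionaryHelper.AddOrAccumulate(const Key: string; const Value: Double);
var
  oldValue: Double;
begin
  // Existing key: add onto the stored value. New key: store the value as-is.
  if Self.TryGetValue(Key, oldValue) then
    Self.AddOrSetValue(Key, oldValue + Value)
  else
    Self.AddOrSetValue(Key, Value);
end;

var
  Totals: TDictionary<string, Double>;
begin
  Totals := TDictionary<string, Double>.Create;
  try
    Totals.AddOrAccumulate('charity-0', 23);     // new key
    Totals.AddOrAccumulate('charity-0', 12);     // accumulates: 23 + 12
    Totals.AddOrAccumulate('charity-1', 20.59);  // another new key

    Writeln('charity-0: ', Totals['charity-0']:0:2);
    Writeln('charity-1: ', Totals['charity-1']:0:2);
  finally
    Totals.Free;
  end;
end.
```

The helper only applies to TDictionary<string, Double> - another key or value type would need its own helper.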

Sorting the data


A TDictionary<> can't be sorted directly - so one way of doing it is by putting it into a TArray<>, defined as:

DataArray: TArray<TPair<TKeyRecord,TValueRecord>>;

and then sort that using a TComparer with an anonymous method:

DataArray := Data.ToArray;

// CompareValue lives in System.Math, TComparer in System.Generics.Defaults.
TArray.Sort<TPair<TKeyRecord,TValueRecord>>(DataArray,
  TComparer<TPair<TKeyRecord,TValueRecord>>.Construct(
    function (const L, R: TPair<TKeyRecord,TValueRecord>): Integer
    begin
      // Order by date first, then by charity within the same date.
      Result := CompareValue(L.Key.IntDate, R.Key.IntDate);
      if Result = 0 then
        Result := CompareValue(L.Key.CharityId, R.Key.CharityId);
    end));

Accessing the full data


The split of the data into a key record and a value record might seem a bit messy, but an easy way to traverse the data, whether it is in the TDictionary<> or the TArray<> - where we sorted the data - is just to loop through the TPair<TKeyRecord,TValueRecord> records - like this to fill our memo:

pair: TPair<TKeyRecord,TValueRecord>;
...
Memo1.Lines.BeginUpdate;
try
  // One line per pair: key fields first, then the value fields.
  for pair in DataArray do
    Memo1.Lines.Add(pair.Key.IntDate.ToString+', '+pair.Key.CharityId.ToString+', '+
      pair.Value.Count.ToString+', '+pair.Value.EmpId.ToString+', '+pair.Value.Sum.ToString);
finally
  // Always re-enable updates, even if adding a line raises.
  Memo1.Lines.EndUpdate;
end;

So when having the pair, both parts can be accessed by either Key or Value.
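
The same kind of loop works directly on the dictionary, just without any guaranteed order. A minimal sketch, with a single made-up entry and console output instead of the memo:

```pascal
program IteratePairsSketch;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  TKeyRecord = record
    IntDate: Integer;
    CharityId: Integer;
  end;

  TValueRecord = record
    Count: Integer;
    EmpId: Integer;
    Sum: Double;
  end;

var
  Data: TDictionary<TKeyRecord, TValueRecord>;
  Pair: TPair<TKeyRecord, TValueRecord>;
  K: TKeyRecord;
  V: TValueRecord;
begin
  Data := TDictionary<TKeyRecord, TValueRecord>.Create;
  try
    // One entry, matching the (0, 1) row of the sample output.
    K.IntDate := 0;
    K.CharityId := 1;
    V.Count := 1;
    V.EmpId := 3;
    V.Sum := 20.59;
    Data.Add(K, V);

    // The dictionary enumerates TPair records, same as the sorted array.
    for Pair in Data do
      Writeln(Pair.Key.IntDate, ', ', Pair.Key.CharityId, ', ',
        Pair.Value.Count, ', ', Pair.Value.EmpId, ', ', Pair.Value.Sum:0:2);
  finally
    Data.Free;
  end;
end.
```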

All done and good


I will put the full source code of the sample on my GitHub - but I might add some extra features and another post before that.

I hope that there might have been something useful in here, and not too many errors and babbling, since I did put myself on a stopwatch - and the sample did change a bit along the way :D

I used 10.4 while writing this post - and I should say that the new LSP-based code-insight is very enjoyable compared to what we have had for many years - just the fact that you can type a word and get all results where the word is contained - not only those starting with it - is very helpful. Sorry, went off-topic. Good night.

“It was the best of times, it was the worst of times.”
― Charles Dickens, A Tale of Two Cities

/Enjoy - and stay safe.
