Splunk join with different sourcetype

10/5/2023

I have been scouring the community and other boards, but for the life of me I cannot create an SPL query that gets the results I need. Below is what I am trying to accomplish; any direction and help would be greatly appreciated.

SourceA has fields (ID_a, name, date, trimmed_name, env, etc.).
SourceB has fields (ID, title_id, title, description, solution, etc.).
Lookup has fields (trimmed_name, department, etc.).

SourceA gets updated weekly, so the existing query uses earliest=-7d to exclude previous data.

Characteristics of SourceA: name is the only unique field; ID_a is duplicated across name depending on the date field; env is duplicated across the dataset; trimmed_name is a field created by trimming name.

SourceA ID_a and SourceB ID are the common field between the two datasets.

Characteristics of SourceB: there are duplicates of ID, title_id, title, description, and solution. This is why I called SourceB a knowledge base. When deduping on title_id, the values of title, description, solution, etc. are deduped as well and become unique. We can dedup SourceB down to the fields above to get a finite list. SourceB will have multiple values of ID, but we only need to return fields and values where the deduped SourceA ID_a exists. This means that if 100 events in SourceA, deduped by ID_a, result in 2 ID_a values, those 2 values are what we need to find in SourceB and return title_id, title, description, and solution for.

At a minimum, the final results needed per source are:
SourceB: title_id, title, description, solution (other fields can be omitted, assuming we can add specific fields back as needed).
Lookup: department (other fields can be omitted, assuming we can add specific fields back as needed).

From what I gather, I have events for the query, but nothing displays when I try to get data from SourceB and SourceA at the same time.

index="myindex" (sourcetype="SourceB" type=INFO) OR (sourcetype="SourceA" date< env=envA)
| stats count values(*) AS * values(date) values(title) values(env) BY name
| fields ID, name, title, description, solution, date, env

When you rename ID_a as ID, you also erase any value of ID that comes from SourceB. The stats command's group-by will exclude any data from SourceB, since BY name keeps only events that have a name field. The last filter will further exclude any "name" that appears in more than one event in SourceA, which is very unlikely.

The goal is to pull all fields of interest in SourceA, add fields of interest in SourceB if ID_a in the former matches ID in the latter, and pull additional fields from the lookup if trimmed_name in SourceA is found in the lookup. Combining these, the search should be something like

index="myindex" (sourcetype="SourceB" type=INFO) OR (sourcetype="SourceA" date< env=envA)
| eval ID = if(sourcetype = "SourceA", ID_a, ID) ``` use ID as universal field name ```
| stats count values(*) AS * BY ID ``` not by name ```
| lookup lookuptable.csv trimmed_name ``` match trimmed_name ```
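
To see why grouping by name loses the SourceB fields while grouping by a shared ID keeps them, here is a small Python sketch that runs outside Splunk. It only approximates what `stats values(*) AS * BY <field>` and `eval ID = if(sourcetype = "SourceA", ID_a, ID)` do. The sample events, their values, and the helper names `stats_values_by` and `with_universal_id` are made up for illustration; only the field names come from the post.

```python
from collections import defaultdict

# Hypothetical sample events; field names follow the post, values are invented.
events = [
    {"sourcetype": "SourceA", "ID_a": "100", "name": "host1.example", "env": "envA"},
    {"sourcetype": "SourceA", "ID_a": "100", "name": "host2.example", "env": "envA"},
    {"sourcetype": "SourceB", "ID": "100", "title": "Patch TLS", "solution": "Upgrade"},
    {"sourcetype": "SourceB", "ID": "200", "title": "Unused", "solution": "None"},
]


def stats_values_by(rows, key):
    """Rough analogue of `stats values(*) AS * BY key`.

    Rows that have no value for the BY field are dropped, and every other
    field is collected into a sorted list of distinct values per group.
    """
    groups = defaultdict(lambda: defaultdict(set))
    for row in rows:
        if row.get(key) is None:
            continue  # no value for the BY field, so the row joins no group
        for field, value in row.items():
            if field != key:
                groups[row[key]][field].add(value)
    return {
        group: {field: sorted(values) for field, values in fields.items()}
        for group, fields in groups.items()
    }


def with_universal_id(row):
    """Rough analogue of `eval ID = if(sourcetype = "SourceA", ID_a, ID)`."""
    out = dict(row)
    out["ID"] = row.get("ID_a") if row["sourcetype"] == "SourceA" else row.get("ID")
    return out


# Grouping by name: SourceB rows have no name, so their fields never appear.
by_name = stats_values_by(events, "name")

# Grouping by the universal ID: rows from both sources land in the same group.
by_id = stats_values_by([with_universal_id(e) for e in events], "ID")

if __name__ == "__main__":
    print(by_name)
    print(by_id)
```

In this sketch an ID that exists only in SourceB (here "200") still produces its own group, so a real search would likely need a further filter if only IDs present in SourceA are wanted.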
