Posts

Showing posts with the label data

Synthetic Data Generation and Java Faker

Testing is often hard, mostly because we need data to perform some forms of testing. For unit tests you can often fake your own data pretty easily. However, as we move to integration tests, E2E tests, Big Data and Data Science discovery, Product Owner (PO) acceptance testing, and PO discovery, we might have a scaling issue. For sure, you could build a tactical solution where you hardcode all the domains/data sets of your company, but this approach is highly coupled to the systems' databases and is hard to maintain. So there are tools and startups in the land of synthetic data generation that use metadata to understand databases and generate data that respects constraints like size, type of data, and verifications (CPF for Brazil, SSN for the US). This approach is hard, but I believe it is the future. Today I want to show something in that direction, but much simpler, and maybe a starting point: Java-Faker, which is similar to Faker.js.
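To make the "verifications" constraint concrete, here is a minimal sketch in plain Java of generating synthetic CPF numbers whose two check digits are valid. This is not Java-Faker's API: the class name `SyntheticCpf` and its methods are hypothetical, made up for illustration, and it uses only the standard library. It assumes the usual CPF check digit rule (weighted sum, modulo 11).

```java
import java.util.Random;

// Hypothetical helper for illustration only; NOT part of Java-Faker.
// Generates 11-digit CPF strings with valid check digits.
final class SyntheticCpf {

    // Computes a CPF check digit over the first `length` digits.
    // Weights run from (length + 1) down to 2.
    static int checkDigit(int[] digits, int length) {
        int sum = 0;
        for (int i = 0; i < length; i++) {
            sum += digits[i] * (length + 1 - i);
        }
        int remainder = sum % 11;
        return remainder < 2 ? 0 : 11 - remainder;
    }

    // Builds 9 random base digits, then appends the two check digits.
    static String generate(Random random) {
        int[] digits = new int[11];
        for (int i = 0; i < 9; i++) {
            digits[i] = random.nextInt(10);
        }
        digits[9] = checkDigit(digits, 9);
        digits[10] = checkDigit(digits, 10);

        StringBuilder cpf = new StringBuilder(11);
        for (int digit : digits) {
            cpf.append(digit);
        }
        return cpf.toString();
    }

    // Checks only the two check digits. Real validators usually also
    // reject repeated-digit sequences such as 11111111111; this sketch does not.
    static boolean isValid(String cpf) {
        if (cpf == null || !cpf.matches("\\d{11}")) {
            return false;
        }
        int[] digits = new int[11];
        for (int i = 0; i < 11; i++) {
            digits[i] = cpf.charAt(i) - '0';
        }
        return digits[9] == checkDigit(digits, 9)
            && digits[10] == checkDigit(digits, 10);
    }

    public static void main(String[] args) {
        // A fixed seed keeps the generated test data reproducible between runs.
        Random random = new Random(42);
        for (int i = 0; i < 3; i++) {
            String cpf = generate(random);
            System.out.println(cpf + " valid=" + isValid(cpf));
        }
    }
}
```

Passing a seeded `Random` is a deliberate choice here: synthetic data that changes on every run makes failing tests hard to reproduce. A metadata-driven tool would apply the same idea per column, picking a generator based on the column's type and constraints.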

Why Big Data is Hard?

Yes, Big Data is hard. It's not hard only because of the number of technologies a good data engineering team needs to master, such as Spark, Flink, and Kafka Streams (batch and streaming); Hadoop, HDFS, and Hive if you have a legacy DW (most likely you do); and the Data Science part of it, with Discovery and Execution at Scale. There are needs for different kinds of storage and Design/Modeling, and that's not even the hard part. The technology landscape gets bigger and bigger as time passes. We have many specializations such as Frontend/Mobile Engineering, Backend Engineering, Architecture, DevOps (which is a movement, not a department, but all companies decide it is a role, so you know what I mean), QA (a dying one?), Product, Management, and Data Engineering, which often has Data Scientists working with Data Engineers. To some degree, Data Engineering and Data Science have the same issues as Product has today. Unfortunately, the product folks still too much about project m...

Kotlin DSL

A DSL (domain-specific language) is an expressive way to declare data and even business rules or configurations. DSLs can be either internal or external. Kotlin has interesting support for building internal and external DSLs. 12 Days of Kotlin is a great way to learn more about Kotlin-idiomatic solutions. Today I will be exploring the 12 Days of Kotlin posts with regard to DSLs: we will understand how they work and what options Kotlin provides to build rich DSLs. So I recorded a video and also coded the sample from the 12 Days of Kotlin post. So let's get started.

The Video

The Code

https://github.com/diegopacheco/kotlin-playground/tree/master/kotlin-dsl-fun

Cheers,
Diego Pacheco

Micro-Workers: A flavor of Microservices?

Microservices are about ISOLATION. I need to point out some different aspects here. Keep in mind that the word Microservices is composed of 2 words. The first is micro: that's where you get the isolation, the minimal business unit, a great marriage with REST in the sense of RESOURCE, and several other things I mentioned in a previous post. The other word is service. This is where SOA comes into play. There are lots of people talking BAD about SOA nowadays, but they don't realize MSA is SOA. A service is not a web service, so that word is more meaningful than people imagine. SOA is about principles; it's about great foundations that enable Service Orientation. It is an architecture, but it's also a way of thinking, also called SO (Service Oriented). Microservices make a lot of sense if you are coming down from a monolith; it's very DDD-like if you pay attention.