GCP Data Engineer Intermediate Practice Exam: Medium Difficulty 2025
Ready to level up? Our intermediate practice exam features medium-difficulty questions with scenario-based problems that test your ability to apply concepts in real-world situations. Perfect for bridging foundational knowledge to exam-ready proficiency.
Your Learning Path
What Makes Intermediate Questions Different?
Apply your knowledge in practical scenarios
Medium Difficulty
Questions that test application of concepts in real-world scenarios
Scenario-Based
Practical situations requiring multi-concept understanding
Exam-Similar
Question style mirrors what you'll encounter on the actual exam
Bridge to Advanced
Prepare yourself for the most challenging questions
Medium Difficulty Practice Questions
10 intermediate-level questions for the Google Cloud Professional Data Engineer exam
A retail company is building a data platform where events from mobile apps must be processed in near real time for fraud detection, while also supporting daily batch reporting. The team wants a single pipeline design pattern that minimizes duplicated logic and works for both streaming and batch. Which approach should they choose?
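This scenario is probing the unified batch/streaming pattern (the model behind Apache Beam on Dataflow): business logic is written once and applied unchanged to both a bounded and an unbounded source. The following is a minimal, library-free sketch of that idea; the names (`score_event`, `run_pipeline`, the fraud rule) are illustrative, not a real API.

```python
from typing import Iterable, Iterator

def score_event(event: dict) -> dict:
    """Shared business logic: flag suspicious events (illustrative rule)."""
    return {**event, "fraud_flag": event["amount"] > 1000}

def run_pipeline(events: Iterable[dict]) -> Iterator[dict]:
    """One pipeline definition; only the source decides batch vs. streaming."""
    for event in events:
        yield score_event(event)

# Batch mode: a bounded source (e.g., yesterday's events read from storage).
daily_report = list(run_pipeline([{"id": 1, "amount": 50},
                                  {"id": 2, "amount": 5000}]))

# Streaming mode: the same pipeline over an unbounded feed of live events.
def live_feed() -> Iterator[dict]:
    yield {"id": 3, "amount": 9999}

alerts = [e for e in run_pipeline(live_feed()) if e["fraud_flag"]]
```

Because the transform is defined once, the fraud rule cannot drift between the real-time path and the nightly report, which is the duplicated-logic problem the question describes.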
A logistics company needs to ingest telemetry from 50,000 vehicles. Messages arrive out of order and can be delayed by up to 10 minutes. The company computes per-vehicle rolling metrics every 1 minute and must ensure correctness when late events arrive. Which solution best addresses these requirements?
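The mechanism at stake here is event-time windowing with a watermark and allowed lateness (as in Beam/Dataflow). A library-free sketch of that bookkeeping, with illustrative names and a global watermark for simplicity:

```python
from collections import defaultdict

WINDOW_SECS = 60              # 1-minute windows keyed by event time
ALLOWED_LATENESS_SECS = 600   # events up to 10 minutes late still count

windows = defaultdict(list)   # (vehicle_id, window_start) -> readings
watermark = 0                 # latest event time observed so far

def ingest(vehicle_id: str, event_ts: int, speed: float) -> bool:
    """Assign an event to its event-time window; drop it only when it is
    later than the allowed lateness relative to the watermark."""
    global watermark
    watermark = max(watermark, event_ts)
    if watermark - event_ts > ALLOWED_LATENESS_SECS:
        return False  # beyond allowed lateness: discarded
    window_start = event_ts - (event_ts % WINDOW_SECS)
    windows[(vehicle_id, window_start)].append(speed)
    return True

ingest("v1", 1000, 50.0)             # on-time event, window starting at 960
ingest("v1", 1100, 60.0)             # advances the watermark to 1100
accepted = ingest("v1", 990, 40.0)   # 110s late, but within the bound
avg = sum(windows[("v1", 960)]) / len(windows[("v1", 960)])
```

The key property is that the late event lands in the window its *event* timestamp belongs to (start 960), so the per-vehicle metric is corrected rather than skewed by arrival order.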
A media company ingests clickstream events into BigQuery using streaming inserts. Analysts complain about occasional duplicate events caused by client retries. The company needs to reduce duplicates without building a complex deduplication pipeline. What should they do?
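BigQuery streaming inserts support a per-row `insertId`, which the service uses for best-effort deduplication over a short window: a retry carrying the same `insertId` is dropped. A conceptual stand-in for that behavior (the real field lives on `tabledata.insertAll` rows; this dict-based cache is purely illustrative):

```python
seen = {}  # insertId -> row, standing in for BigQuery's short-lived dedup cache

def streaming_insert(row: dict, insert_id: str) -> bool:
    """Best-effort dedup: a retry with the same insertId is ignored."""
    if insert_id in seen:
        return False  # duplicate caused by a client retry, dropped
    seen[insert_id] = row
    return True

first = streaming_insert({"event": "click", "user": "u1"}, insert_id="abc-123")
retry = streaming_insert({"event": "click", "user": "u1"}, insert_id="abc-123")
```

Because the dedup is best-effort and time-bounded, it reduces retry duplicates without any extra pipeline, but it is not a substitute for exactly-once guarantees.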
A financial services company runs nightly ETL on Dataproc that reads from Cloud Storage and writes outputs to BigQuery. The job occasionally fails midway, leaving partially written outputs that break downstream reporting. The company wants reruns to be safe and not produce duplicates or partial data. What is the best approach?
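The safe-rerun property being tested is idempotent loading: build the complete output first, then atomically replace the target (analogous to loading a date partition with a truncate-and-write disposition in BigQuery). A toy sketch of why reruns then cannot leave duplicates or partial data; the table structure here is invented for illustration:

```python
# A partitioned reporting table, seeded with a broken partial write.
tables = {"reporting": {"2024-05-01": ["old partial rows"]}}

def etl_run(partition: str, rows: list) -> None:
    """Idempotent load: stage the full output, then swap the partition.
    A failure before the swap leaves the old data untouched; a rerun
    produces exactly the same final state."""
    staging = list(rows)                      # complete output built first
    tables["reporting"][partition] = staging  # atomic replace of the partition

etl_run("2024-05-01", ["row1", "row2"])
etl_run("2024-05-01", ["row1", "row2"])  # rerun is safe: same result
```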
A company is building a data lake on Cloud Storage for semi-structured logs. They want cost-effective storage, the ability to evolve schemas, and fast SQL analytics in BigQuery without ingesting everything into native BigQuery tables immediately. Which approach best meets these needs?
An IoT platform stores time-series device data and needs very low-latency lookups for the most recent readings per device as well as high write throughput. Analysts occasionally run large aggregations, but the primary requirement is operational read/write performance. Which storage solution is the best fit for the primary workload?
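For this access pattern, the classic Bigtable-style design is a row key of `device_id` plus a reversed timestamp, so that each device's newest reading sorts first and "latest value" is a cheap prefix scan. A pure-Python sketch of the key layout (the sentinel and key format are illustrative choices, not a Bigtable API):

```python
MAX_TS = 10**10  # sentinel larger than any epoch-seconds timestamp used here

def row_key(device_id: str, event_ts: int) -> str:
    """device_id prefix groups a device's rows; the reversed timestamp
    (MAX_TS - event_ts) makes newer readings sort first lexicographically."""
    return f"{device_id}#{MAX_TS - event_ts:010d}"

rows = {}  # a key-sorted store standing in for a Bigtable table
for ts, reading in [(1000, 21.5), (2000, 22.1), (1500, 21.8)]:
    rows[row_key("dev-42", ts)] = reading

# "Most recent reading" = first row under the device's key prefix.
latest = rows[min(k for k in rows if k.startswith("dev-42#"))]
```

The same layout also spreads writes across devices (the key prefix varies), which supports the high write throughput the scenario calls for.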
A healthcare organization uses BigQuery for analytics and must enforce that only the compliance team can see direct identifiers (e.g., full name), while analysts can see de-identified data. They want to enforce this at query time without duplicating datasets. What should they implement?
A data team maintains a large BigQuery table partitioned by event_date. Query costs are increasing because analysts often filter by customer_id but not always by date. The team wants to improve performance and reduce scanned data without changing analyst behavior significantly. What should they do?
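The mechanism this question points at is clustering: rows are sorted by the clustering column into storage blocks with min/max metadata, so a filter on `customer_id` can skip most blocks even without a date predicate. A toy simulation of that block pruning (block size and values are invented for illustration):

```python
# Rows sorted by the clustering column, then split into storage blocks,
# each carrying (min, max) metadata -- the basis for pruning scanned data.
rows = sorted(range(1, 1001))            # customer_id values 1..1000
BLOCK_SIZE = 100
blocks = [rows[i:i + BLOCK_SIZE] for i in range(0, len(rows), BLOCK_SIZE)]
metadata = [(b[0], b[-1]) for b in blocks]

def blocks_read(customer_id: int) -> int:
    """How many blocks a filter on the clustering column actually touches."""
    return sum(1 for lo, hi in metadata if lo <= customer_id <= hi)

pruned_reads = blocks_read(250)          # lands in a single block
full_scan = len(blocks)                  # what an unclustered filter reads
```

Because pruning happens inside the engine, analysts keep writing the same `WHERE customer_id = ...` queries; only the table definition changes.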
A company has multiple upstream systems producing files to Cloud Storage. A Dataflow job reads new files, transforms them, and loads results into BigQuery. The team needs automated orchestration with dependency management, retries, and visibility into failures across steps (file arrival, transform, load). Which solution is most appropriate?
A streaming pipeline on Dataflow writes aggregated metrics to BigQuery. After a recent change, the pipeline occasionally lags and autoscaling adds workers, increasing cost. The team wants to detect regressions early and troubleshoot bottlenecks (e.g., hot keys, backpressure, slow sinks) using managed observability. What should they do?
Mastered the intermediate level?
Challenge yourself with advanced questions when you score above 85%
Google Cloud Professional Data Engineer Intermediate Practice Exam FAQs
The Google Cloud Professional Data Engineer is a professional certification from Google Cloud that validates expertise in designing and operating data engineering solutions on Google Cloud. The official exam code is PDE.
The GCP Data Engineer intermediate practice exam contains medium-difficulty questions that test your working knowledge of core concepts. These questions are similar to what you'll encounter on the actual exam.
Take the GCP Data Engineer intermediate practice exam after you've completed the beginner level and feel comfortable with basic concepts. This helps bridge the gap between foundational knowledge and exam-ready proficiency.
The GCP Data Engineer intermediate practice exam includes scenario-based questions and multi-concept problems similar to the PDE exam, helping you apply knowledge in practical situations.
Continue Your Journey
More resources to help you pass the exam